identity column

  • i know to use identity data type to add information which should automatically increment.

    but, can some one explain how should we work with the below requirement.

    there is a company with branches in china and japan

    the empid should be like MYCHIN123456 ( first 6 alphabets and next 6 digits )

    the alphabets shows based on the location for JAPAN its "JAPA"

    and the remaining order is differnt for each country

    like china starts @ 10000

    and japan strats @ 50000

    so when ever a new emp joins in china

    it shold be coming as MYCHIN10000 and for japan MYJAPA500000

    anyhow this complete emp id should be unique for each person.

  • You have two options. One is writing your own function which generates new ID's.

    The second option which I would prefer is using two columns one for location and one for the numeric ID. If necessary you can concatenate both columns when displaying them.

    [font="Verdana"]Markus Bohse[/font]

  • MarkusB (12/17/2008)


    You have two options. One is writing your own function which generates new ID's.

    The second option which I would prefer is using two columns one for location and one for the numeric ID. If necessary you can concatenate both columns when displaying them.

    I would agree with Markus that the second option would be preferable with a composite unique index. Of course you still need to do option 1 if you need to partition the numeric portion.

    I would question why you need to partition the numeric portion if you are including the location in the id or unique index? Why can't you have JAPA00001 and MYCHIN000001?

  • As already stated, this is a very bad idea because of several reasons and I might be missing a couple...

    1. First of all, by definition, a "column" in a table is supposed to represent one and only one attribute of the rows in the table. The method you are trying to use combines the two attributes of "location" and "row identifier".

    2. If, as I've seen so often happen, the designed scope of the row identifiers (ie, range of numbers for each location) is exceeded, either you're dead in the water or additional special handling to include another range of row identifiers must be designed and deployed.

    3. In general, it makes the handling of the data a huge pain because, as you're finding out, it's difficult to automatically increment such mixed numbers.

    4. Despite your best efforts, you will end up with gaps in the sequence even if you don't use this method. Don't ever count on their not being gaps except in very instances where the table is never added to such as a Tally, Numbers, or Calendar table or, possibly, a predefined serial number sequence table (which is just plain ineffecient).

    The best and correct way to do this would be to, indeed, have two separate columns for "location" and "row identifier". From there, you could have a "calculated" or "computed" column in the table to concatenate the two values.

    However, if you are unable to convince the designers that combining location and row identifier in a single column is a woeful mistake, then do the next best thing... create a separate table for each location with a constraint on the LOCATION column. That way, you can create an updateable view known as a "partitioned view" to do Inserts, Updates, Deletes, and Selects through as if all the tables were a single table. There are actually some benefits to the maintenance of such tables and partitioned views like smaller indexes to maintain, etc.

    In each of the tables belonging to the partitioned view, you would seed the IDENTITY column with the starting sequence number for the given location and, of course, location would be a separate column. The only place where you need to combine the IDENTITY column and the Location column, would be in the actual view itself because the view would, in fact, use the correct indexes on those two columns in any lookups you may play against the combination of the two.

    Again, I think this is a mistake, but it is sometimes very difficult and even career threatening to convince people like designers (who are supposed to know better), that they've made a terrible mistake. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Oh yeah... almost forgot... the method I described that uses multiple tables, a constraint, and a view to make a "partitioned" view, is being deprecated. If you have no SQL Server 2000 installations to support in this manner, consider using a "partitioned table", instead. The problem with that is that it won't work on the Standard Edition of SQL Server... you need the Enterprise Edition for production.

    I guess that's reason #5 for NOT doing it that way. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There is another option, though a very old option and not very pretty. Only use this if you can't convince the designers to use 2 separate columns in the main table.

    Do NOT use Identity in the main table. Instead, have an ID column which stores the IDs for all records regardless.

    Create a new table called tblCounter. Have two integer columns. JapanID and ChinaID. Then code your stored procedure or application code (more likely the later) so that any inserts fetches the most recently used ID from the appropriate column, based on location, from tblCounter and increments the ID by 1. When the code inserts the record into the main table, it'll prefix the ID with the appropriate location code. Then it'll update tblCounter with the incremented integer number (sans location prefix).

    It's a little bulky, but gets the job done without using depreciated coding tricks.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • To expand on Brandie's option, this solution is kind of ugly and you can only insert 1 Emp row at a time. You would also have to do all inserts with some extra code or in a stored procedure (the preferred way). The table would look like this:

    CREATE TABLE NextEmpTbl

    ( NextJapanID INT

    , NextChinaID INT

    )

    INSERT INTO NextEmpTbl

    ( NextJapanID, NextChinaID )

    SELECT 1000, 50000

    This gives 40,000 ID's that can be used for Japan. If you need more then the gap between them should be much larger.

    The stored procedure would have logic in it like this:

    CREATE PROCEDURE InsertEmp

    @CountryCode VARCHAR(5) -- Pass In either 'JAPA' or 'CHIN'

    DECLARE

    @NextID INT

    , @NewEmp VARCHAR(10)

    -- This type of update ensures no-one else can get the number

    -- you are getting.

    IF @CountryCode = 'JAPA'

    UPDATE NextEmpTbl SET @NextID = NextJapanID

    , NextJapanID = NextJapanID + 1

    ELSE

    UPDATE NextEmpTbl SET @NextID = NextChinaID

    , NextChinaID = NextChinaID + 1

    SET @NewEmp = @CountryCode

    + RIGHT('000000' + CONVERT(VARCHAR, @NewID))

    -- You can now use @NewEmp for the ID.

    -- It's ugly and you can only insert one at a time this way.

    Todd Fifield

  • I've really got to respectfully disagree... the use of a "sequence" table of any kind will become a major hotspot, a possible source of a huge number of deadlocks, and will generally prevent any and all code from being set-based without the presence of temporary working tables and a bunch of computational hooie to get the ID's out of the sequence table. Having two columns in the same table is also and absolutely the wrong way to do this.

    The correct way to do this, if it must be done, is to use a "Partitioned" table. No if's, ands, or but's, it's the only way to go with something like this.

    Just say "NO" to sequence tables... they are a form of "Death by SQL". We had one in our old company and, until I made everyone toe-the-line and wrote a nasty code change, it was the single source of an average of 640 deadlocks per day with spikes up to 4000 deadlocks in a single day.

    If you never trust me on anything else, trust me on this! 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    As I pointed out the sequence table is ugly and only works for a single insert. It should only be done when there are interactive inserts done by users. It's useless, as you pointed out, if any set based inserts are going on.

    There is a major POS system on the market that I support that uses this type of insert logic and it performs quite well. One insert per cash register per customer standing in line. There was another one where only 1 person in the company did invoicing and it worked well.

    The application in question here was adding employees. That seems to be something that would be done one employee at a time.

    Other than these types of applications I totally agree with you. It's ugly, not set based and prone to dead locks.

    Todd Fifield

  • For what it's worth - I'd also make sure that this "user ID" is NOT your actual primary key. Because it carries "user significance", and like Jeff pointed out, because this fields is a violation of normalization by combining multiple data elements, this thing is BOUND to change in the future (Just wait until they want to have MYNORTHCHIN and MYSOUTHCHIN, and "split" the MYCHIN group in half). Do NOT make this your actual PK, or you will have a maintenance nightmare on your hand.

    Let them have any user ID they want if you can't convince them how stupid it is, but for goodness sake, just keep is as a surrogate key. Manage your own key, and insulate it from any stupidity they might come up with.....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • tfifield (1/3/2009)


    Jeff,

    As I pointed out the sequence table is ugly and only works for a single insert. It should only be done when there are interactive inserts done by users. It's useless, as you pointed out, if any set based inserts are going on.

    There is a major POS system on the market that I support that uses this type of insert logic and it performs quite well. One insert per cash register per customer standing in line. There was another one where only 1 person in the company did invoicing and it worked well.

    The application in question here was adding employees. That seems to be something that would be done one employee at a time.

    Other than these types of applications I totally agree with you. It's ugly, not set based and prone to dead locks.

    Todd Fifield

    Ah... but as I said... you CAN do batch inserts with a Sequence table instead of RBAR... it's just a pain compared to the straight forward method of using an IDENTITY column.

    The other thing that I'm trying to nicely point out is that any code that uses a sequence table where you have to provide columnar inputs to a single row is really asking for trouble because now you have an even more concentrated hotspot and requires the very definition of the table to change if you add more "countries". Whether those are the cards that have been dealt to you or not, I don't want anyone to think that's even close to the right way of doing it even if a POS 3rd party bit of software is involved.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Matt and Jeff,

    The points are well taken.

    Todd Fifield

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply