T-Sql rant

  • Jeff Moden (3/20/2009)


    Michael Valentine Jones (3/20/2009)


    Not that I would ever recommend using this to get the next ID for anything I was designing, but when someone has already put something like this is place, it can really be a problem.

    Agreed... but lemme ask, what would you use if, for some strange reason or in the face of replication, you simply couldn't use an IDENTITY column?

    When you say replication, I assume you mean two-way merge replication. I've never done that, but I thought using a uniqueidentifier was the standard way to handle that. I have had a a few situations where I have used both identity and uniquidentifier columns on a table. I did this when we needed to be able to distribute master lookup tables from a central source to target databases. The uniquidentifier is used to maintain a unique key for distribution, while the identity is used within the local transactional system as an efficient foreign key for transactional tables, and can be different from the master. Although the uniquidentifier column is an extra candidate key, typically the lookup tables are small, so that is not a problem. The updates of the data from the central source is not handled by SQL Server replication; we develop our own code for that.

    As for "some strange reason", it's hard for me to comment on that without a little more background on the reason why an identity could not be used. :unsure:

    I have been using identity as the standard surrogate primary key generator on systems for at least 10 years, and I can't say that I have ever had a problem with it.

    I have seen plenty of cases where using "nextid" tables caused a problem. Even when they are well done and don't result in deadlocks or blocks , I think they are still a far less scalable solution than using identity and just generally far more trouble.

  • You get no argument by me... I'd much rather use the highly scalable high performance method of using IDENTITY than going through all of the headaches of using the likes of a sequence table or even GUIDs. Heh... have you ever tried to type a GUID for troubleshooting purposes without making a mistake? 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Michael Valentine Jones (3/20/2009)[/b

    Not that I would ever recommend using this to get the next ID for anything I was designing, but when someone has already put something like this is place, it can really be a problem.

    Using IDENTITY to generated valid unique primary key, then IDENTITY can no longer be used for a user-defined sequence number.

    Besides, when inserting a new item in between items already in the table, IDENTITY could not work to resetting the sequence.

  • Jeff Moden (3/21/2009)


    Lynn Pettis (3/20/2009)


    Jeff Moden (3/20/2009)


    If you did it your way, which numbers would be reserved if you needed 100 numbers as the increment?

    You know, I was looking at it from an increment of 1. It may take additional thought to see how it could be made to work if you were taking a block of values to use sequentially by one in memory, but I'm sure it could be done. Perhaps I'll look into this over the next week or so if I have some time. Could make for a simple article or blog. Definitely one for the mind.

    Absolutely correct... consider this... let's say the NextID is sitting at the number 1000. By definition of the process, that's the "Next ID available". You want to reserve 100 numbers to use.... so the Increment is 100. Here's what happens

    Starting Condition:

    NextID = 1000, Increment = 100.

    Post Update Condition:

    NextID = NextID+Increment = 1000+100 = 1100 for the next usage.

    Post Select Condition:

    @NextID = NextID-Increment = 1100-100 = 1000... the original NextID value for this run.

    Final Condition:

    @NextID = 1000, the original value correctly being sought.

    NextID = 1100 which is the Next ID available to the next person/run and makes all the numbers from 1000 to 1099 (100 numbers if you count them including the 1000) unavailable to anyone else, effectively reserving them.

    Using either a temp table with an IDENTITY starting at 0 or a ROW_NUMBER() starting at 1, you can use the following formulas to insert new rows into the target table with the calculated PK (let's just call it "ID") like this...

    --===== Temp table method

    INSERT INTO dbo.TargetTable

    (ID, ...list of columns...)

    SELECT @NextID + t.RowNum AS ID

    ...list of columns...

    FROM SomeTempTable

    --===== ROW_NUMBER() method

    INSERT INTO dbo.TargetTable

    (ID, ...list of columns...)

    SELECT @NextID + ROW_NUMBER() OVER (ORDER BY SomeColumn(s)) -1 AS ID,

    ...list of columns...

    FROM SomeJoinedTablesOrWhatever

    Because both the variable @NextID and the value in the table column NextID are both derived during the same implicit transaction, there's no chance of someone "getting in between" a Select and Update like there might be when using other methods either on the target table or the NextID table. Sure, you can prevent that, but you'd need to use an explicit 2 part declared transaction like the following traditional method...

    SET XACT_ABORT ON

    BEGIN TRANSACTION

    SELECT @NextID = NextID

    FROM dbo.NextID

    WHERE TableName = @TableName

    UPDATE dbo.NextID

    SET NextID = NextID + Increment

    WHERE TableName = @TableName

    COMMIT

    ... and THAT exact code was what caused an average of 640 deadlocks per day with spikes to as many as 4000 per day on the system at work. Just to be complete, here's the code, again, that made all but a small handful of them (caused by other things) go away...

    UPDATE dbo.NextID

    SET @NextID = NextID = NextID + @Increment

    WHERE TableName = @TableName

    SELECT @NextID = @NextID - @Increment

    It's very, very quick, allows for the reservation of more than just one NextID at a time with no additional overhead, and there's no chance of accidental duplication, ever, all while resolving the NextID or Sequence table as a source of deadlocks.

    Thinking ahead, it would be really cool if Microsoft either made it so functions could update tables or they'd just create a NextVal function like what some engines have so people wouldn't have to jump through hoops when they're forced to NOT use IDENTITY as an incremental source for numbering rows (ie. sequences) in a table .

    Maybe, maybe not. I think it can be done as I proposed, I just don't have the time this weekend (at least at the moment, as we are getting things ready for a birthday party) to do more in depth study. Let's table the discussion for a bit, and we can bring it up again in a new thread in the next couple of weeks. I really want to work this through and see it. This could, in fact, be a excellent start for an article and/or blog (by the way Steve, hate to be a pain, but have figured out how i can change my password so I can use LiveWriter with my blogs on SSC and blogger?).

    If I find this going to a article, I'll be sure to send it to you for technical review. Please be sure, I want to do this to see for myself if it can be done and if so how much additional work/thought has to go into it, and to compare both methods against each other. As you would say, put it in code. I am willing to do so, and if your method is the best, I'll admit it, no problem.

    Agreed?

  • Sure... well, maybe except for the article thing. I was thinking that the idea of using a sequence table correctly would make a good article right after I wrote my first suggestion about it on this thread. Perhaps a collaborative article, hmmm?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/21/2009)


    Sure... well, maybe except for the article thing. I was thinking that the idea of using a sequence table correctly would make a good article right after I wrote my first suggestion about it on this thread. Perhaps a collaborative article, hmmm?

    That sounds like a plan. Perhaps we should take up planning offline. This way we can establish the proper testing environment etc.

  • Sounds like a plan... like you, I'm a bit busy this weekend but I'll get things going on this soon. We should also explore some of the other methods and their problems like SELECT MAX(), etc. Heh... I'm sure I know someone that can setup a million row test table... 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • We also need the proper "business" scenario. Could be something based on our past experience or something similiar. I guess we should come up with some ideas.

  • J (3/21/2009)


    Michael Valentine Jones (3/20/2009)[/b

    Not that I would ever recommend using this to get the next ID for anything I was designing, but when someone has already put something like this is place, it can really be a problem.

    Using IDENTITY to generated valid unique primary key, then IDENTITY can no longer be used for a user-defined sequence number.

    Besides, when inserting a new item in between items already in the table, IDENTITY could not work to resetting the sequence.

    Ok, but so what? The point of using an identity is to generate a unique primary key, not a user-defined sequence number.

    Exactly what value a user-defined sequence number has in the majority of real applications is very debatable. Why exactly would an order number generated by an identity be less suitable than an order generated by a NextID table? If Amazon gives you an order number like 00289772929801825, do you think it matters how it was generated?

  • Lynn Pettis (3/21/2009)


    We also need the proper "business" scenario. Could be something based on our past experience or something similiar. I guess we should come up with some ideas.

    Nah... just a disclaimer at the beginning saying what Michael just said in the post above this one. Except for what I call "Dual Master Replication", it's generally a mistake to use a sequence table. The only reason why I'd want to write such an article is for those people who have had the mistake heaped upon them so they can at least do it without a bazillion deadlocks showing up. That's what happened to me... the business scenario was that a bunch of 3rd party dummies that didn't really know squat about T-SQL wrote a billing package that used a sequence table... I had to figure out a way to fix the 640 average deadlocks per day and the code I posted was how I did it. Heh... I guess that's the business scenario... forced to fix other people's mistakes.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The thread on the link below is a good example of a way not to generate a Next ID. I had thought about this method myself, so this is a good warning about how things can go wrong if you don't test an idea. My guess is he will be needing a valid New ID method real soon now.

    rowversion (timestamp) not unique?

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=121777

  • Michael Valentine Jones (3/22/2009)


    The thread on the link below is a good example of a way not to generate a Next ID. I had thought about this method myself, so this is a good warning about how things can go wrong if you don't test an idea. My guess is he will be needing a valid New ID method real soon now.

    rowversion (timestamp) not unique?

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=121777

    Heh... that's a good one, Michael... It never ceases to amaze me how people will actually go out of their way to blow stuff up on a perfectly good database system. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • That was interesting, Michael. Amazing what people do. I've never depending on rowversion, and never would.

  • [font="Verdana"]For a real world scenario, can I suggest you consider world-wide logistics? Say you have sites around the world generating shipping bills (bills of lading?), but you want to collect up all of those bills into one central location for reporting.

    A good design will allow each of the branches to process (largely) independently, with easy meshing of the results. So pre-allocating numbers for keys is a good scheme.

    Just out of interest, the team I was working with (not me, they'd solved it before I arrived) solved this issue in much the same way as what Jeff has been talking about.

    [/font]

  • Bruce W Cassidy (3/22/2009)


    [font="Verdana"]For a real world scenario, can I suggest you consider world-wide logistics? Say you have sites around the world generating shipping bills (bills of lading?), but you want to collect up all of those bills into one central location for reporting.

    [/font]

    In the real world I implemented my approach in the application for world-wide supplier.

    It's only about invoices and payments, not shipping bills.

    Exactly as you say: it collects all invoices into one central location for payment allocations, dispute resolutions and reports.

    But in real world you cannot change ERP applications the data is coming from. So, in fact I needed to use sql_variant to accommodate all sorts of ID's we have to deal with.

    And believe me, mapping of SourceSystem+OriginalID to InternalID works perfectly.

    No service required.

    _____________
    Code for TallyGenerator

Viewing 15 posts - 346 through 360 (of 465 total)

You must be logged in to reply to this topic. Login to reply