Concurrency, Insert Into A Table To Capture A Range Of Numbers

  • The typo is an excellent point. My approach is to come up with a couple of options and then look at the pros and cons of each. You have given me one which is the update. The other is to do an insert. I believe with the insert I have to set the transaction isolation level to serializable but I am not sure. I am looking for guidance on that. I believe setting transaction isolation level to serializable can result in a higher incidence of deadlocks which would make the update approach more attractive.

  • Here's the BOL entry on SET TRANSACTION ISOLATION LEVEL:

    "SERIALIZABLE

    Specifies the following:

    Statements cannot read data that has been modified but not yet committed by other transactions.

    No other transactions can modify data that has been read by the current transaction until the

    current transaction completes.

    Other transactions cannot insert new rows with key values that would fall in the range of keys

    read by any statements in the current transaction until the current transaction completes.

    Range locks are placed in the range of key values that match the search conditions of each

    statement executed in a transaction. This blocks other transactions from updating or inserting

    any rows that would qualify for any of the statements executed by the current transaction.

    This means that if any of the statements in a transaction are executed a second time, they will

    read the same set of rows. The range locks are held until the transaction completes. This is

    the most restrictive of the isolation levels because it locks entire ranges of keys and holds

    the locks until the transaction completes. Because concurrency is lower, use this option only

    when necessary. This option has the same effect as setting HOLDLOCK on all tables in all SELECT

    statements in a transaction."

    Looks like exactly what's needed, but I'm not the expert either. However, I'd like to believe that the number of concurrent users and overall number of requests would be low enough that a well-performing update query wouldn't be holding that lock for terribly long. Get it down to sub-second response and I'd imagine the likelihood starts to drop considerably.

    Steve

    (aka smunson)

    :-):-):-)

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • I will go with that unless I hear something else. I'll test the response time as well. Thanks.

  • I'm still wondering about that primary key.... What happens if a company needs more numbers? Maybe it's in one of the previous posts. Haven't had time to study them, yet.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • They'll get a new rangeid so if they first went in as IBM they will next go in as IBM2.

  • wesigler (7/6/2009)


    My approach is to come up with a couple of options and then look at the pros and cons of each.

    Then, my approach would be to avoid all deadlocks and other problems associated with maintaining such a table of pre-reserved blocks of ID's. The way to do this is simply to let each company do the necessary inserts with the appropriated ID for the company and let an IDENTITY column take care of the record ID's. The idea of having sequential blocks of ID's doesn't seem to have any merit here. If the goal is to limit the number of ID's based on sales, then do a count of ID's owned by the company prior to the insert of additional rows.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • smunson (7/6/2009)


    Jeff Moden: I'd really appreciate your input on this...

    It might not be what you expect... my input is that in this modern day of computers and other new fangled stuff 😉 like auto-numbering columns, indexes to support high speed aggregates and the like, there is no reason to preassign ranges of numbers for anything... especially for a common multi-company table. Even partitioned tables would be worked by date and CompanyID and not by record ID.

    In fact, to avoid any type of clash and to give each company their own set of numbers starting at "1" or whatever number they want to start it at, it seems to me than 1 table per company each with it's own identity column would be the way to go. It would also alleviate any deadlock problems that people are currently fretting about on this problem right now.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • smunson (7/6/2009)


    Here's the BOL entry on SET TRANSACTION ISOLATION LEVEL:

    "SERIALIZABLE

    Specifies the following:

    Statements cannot read data that has been modified but not yet committed by other transactions.

    No other transactions can modify data that has been read by the current transaction until the

    current transaction completes.

    Other transactions cannot insert new rows with key values that would fall in the range of keys

    read by any statements in the current transaction until the current transaction completes.

    Range locks are placed in the range of key values that match the search conditions of each

    statement executed in a transaction. This blocks other transactions from updating or inserting

    any rows that would qualify for any of the statements executed by the current transaction.

    This means that if any of the statements in a transaction are executed a second time, they will

    read the same set of rows. The range locks are held until the transaction completes. This is

    the most restrictive of the isolation levels because it locks entire ranges of keys and holds

    the locks until the transaction completes. Because concurrency is lower, use this option only

    when necessary. This option has the same effect as setting HOLDLOCK on all tables in all SELECT

    statements in a transaction."

    Looks like exactly what's needed, but I'm not the expert either. However, I'd like to believe that the number of concurrent users and overall number of requests would be low enough that a well-performing update query wouldn't be holding that lock for terribly long. Get it down to sub-second response and I'd imagine the likelihood starts to drop considerably.

    Steve

    (aka smunson)

    :-):-):-)

    Serializable is the default.

    No matter what happens, in order to guarantee that no one "gets in" on your numbers, you must start an explicit transaction, do a select, do an insert based on the outcome of that select, and then to a commit... perfect spot for deadlocks to happen unless you also do a TABLOCKX or at least an UPDLOCK and I'm pretty sure that UPDLOCK won't do the job in this case because it allows others to read... that would be devastating in this case because they could read something they couldn't have. Pray that the frequency of inserts will be low or a lot of people could end up waiting.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'd agree but for the "human factor". Often, people have to deal with these numbers regularly, and a lack of "organization" of those numbers can be a genuine challenge to people who are working by the hour. They'll prefer a familiar prefix for a number involving a given client, instead of just some number. Thus the number conveys information, and you might think that with all our automation, we'd have overcome our fear of numbers, but alas, I'm not even sure the very best of education could quell what may just be "human nature". This kind of "accommodation" for human nature can often be responsible for helping to catch potential data entry errors, because someone know what the number should start with, and sees something unfamiliar. That's impossible without pre-assigned ranges or other means of the number communicating something about who it's assigned to. We don't really know if that applies in this case, so let's hear from the original poster...

    Steve

    (aka smunson)

    :-):-):-)

    Jeff Moden (7/6/2009)


    smunson (7/6/2009)


    Jeff Moden: I'd really appreciate your input on this...

    It might not be what you expect... my input is that in this modern day of computers and other new fangled stuff 😉 like auto-numbering columns, indexes to support high speed aggregates and the like, there is no reason to preassign ranges of numbers for anything... especially for a common multi-company table. Even partitioned tables would be worked by date and CompanyID and not by record ID.

    In fact, to avoid any type of clash and to give each company their own set of numbers starting at "1" or whatever number they want to start it at, it seems to me than 1 table per company each with it's own identity column would be the way to go. It would also alleviate any deadlock problems that people are currently fretting about on this problem right now.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Jeff, I am a little confused by your last two posts. The Alternate IDs start at 800000000 and go to 899999999. I don't see how you can get this with an identity column. I know you can start an identity column at a given number, but are you suggesting I do an insert for each alternate id? So if a company needs 50,000 Alternate IDs I do 50,0000 inserts. And then the other way is hava a table for each company so If I have 1000 companies I have 1000 tables?

  • smunson (7/6/2009)


    I'd agree but for the "human factor". Often, people have to deal with these numbers regularly, and a lack of "organization" of those numbers can be a genuine challenge to people who are working by the hour.

    Heh... my recommendation would be to do the accounting on paper then to accomodate the "human factor". 😉 Obviously, I'm kind of kidding there.

    Then, the only hope of pulling this off with no chance of someone getting someone else's numbers during simultaneous transactions is to use an explicit transaction with a TABLOCKX to try to prevent deadlocks and, like I said, pray for a very low number of very infrequent transactions.

    Also, having a primary key that allows a company to appear only once is a bit counter intuitive. IBM is IBM... not IBM2 which is a form of denormalization by itself. There needs to be something else for a primary key regardless. Because of the range thing, the business logic behind this needs to be absolutely air-tight because you cannot make a constraint to check if the current row values fall between other row values. You could add a trigger to do that level of constraining but that would also be included in the transaction further aggrevating the possibility of a deadlock.

    It's my belief that atransaction that uses a TablockX would assist the necessary business logic but, be warned, the proc that gets the new numbers must not itself be used in a transaction or deadlocks will surely occur.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There may be even a better way to avoid deadlocks on this. Combine the Insert with the Select... I believe you'd still need the TABLOCKX because you're needing to look at the whole table for gaps (this is where the problem rears it's ugly head with stuff like this) but an explicit 2 part transaction could be avoided which should further reduce the possibility of a deadlock.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • wesigler (7/6/2009)


    Jeff, I am a little confused by your last two posts. The Alternate IDs start at 800000000 and go to 899999999. I don't see how you can get this with an identity column. I know you can start an identity column at a given number, but are you suggesting I do an insert for each alternate id? So if a company needs 50,000 Alternate IDs I do 50,0000 inserts. And then the other way is hava a table for each company so If I have 1000 companies I have 1000 tables?

    If a company isn't going to use each alternate id separately, why are we calculating what they should be in the first place? Do you mean to tell me that these alternate id's will NOT be used anywhere else in the database? Maybe my suggestion of doing it all on paper wasn't so far off. 😛

    Also, I'm not suggesting inserting 50,000 id's at once (although that's a pretty small number of rows to begin with and it would solve all these problems)... rather I'm suggesting that 50,000 id's be automatically doled out one per row inserted (using IDENTITY) as they are used elsewhere in the database... that's provided they're used somewhere else in the database.

    Shifting gears, you're correct... if you have 1,000 companies, then partitioning by individual company may be a bit of a problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You don't have to give "blocks" of numbers out.

    If you had tables like:

    CREATE TABLE Company (

    CompanyID int NOT NULL identity PRIMARY KEY,

    Name varchar(50) NOT NULL)

    CREATE TABLE AlternateID (

    CompanyID int NOT NULL,

    AltIDNumber int NOT NULL,

    PRIMARY KEY (CompanyID, IDNumber))

    Each company could have an alternate id starting at 1.

    The whole alternate id # would be the CompanyID + AltIDNumber, ie:

    01050-000000001 - 01050-999999999

    13852-000000001 - 13852-999999999

    etc. Here each company could have 999,999,999 unique numbers to use.

    Change both of the int to smallint, and you could have:

    0105000001 - 0105099999

    1385200001 - 1385299999

    3276800001 - 3276899999

    each company here would have 99,999 unique numbers to use. This might be enough, but not sure about the larger companies.

    While each company's range would start at one, since the whole alternate id # includes the company identifier, there would not be any duplicates.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Idea understood... but that's a sequence table and that's even worse that what's trying to be done for updates unless you're very, very, very careful and you use a 3 part update to do the dirty work. It also makes it a real bitch to do multi-row inserts. We had one at work very similar to that. It caused an average of 640 deadlocks per day with spikes to 4000 until we fixed it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 42 total)

You must be logged in to reply to this topic. Login to reply