Concurrency, Insert Into A Table To Capture A Range Of Numbers

Question

Concurrency, Insert Into A Table To Capture A Range Of Numbers

Viewing 15 posts - 16 through 30 (of 42 total)

You must be logged in to reply to this topic. Login to reply

wesigler SSC Journeyman Points: 84 More actions · Answer 1

The typo is an excellent point. My approach is to come up with a couple of options and then look at the pros and cons of each. You have given me one which is the update. The other is to do an insert. I believe with the insert I have to set the transaction isolation level to serializable but I am not sure. I am looking for guidance on that. I believe setting transaction isolation level to serializable can result in a higher incidence of deadlocks which would make the update approach more attractive.

sgmunson SSC Guru Points: 110651 More actions · Answer 2

Here's the BOL entry on SET TRANSACTION ISOLATION LEVEL:

"SERIALIZABLE
Specifies the following:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the
current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys
read by any statements in the current transaction until the current transaction completes.
Range locks are placed in the range of key values that match the search conditions of each
statement executed in a transaction. This blocks other transactions from updating or inserting
any rows that would qualify for any of the statements executed by the current transaction.
This means that if any of the statements in a transaction are executed a second time, they will
read the same set of rows. The range locks are held until the transaction completes. This is
the most restrictive of the isolation levels because it locks entire ranges of keys and holds
the locks until the transaction completes. Because concurrency is lower, use this option only
when necessary. This option has the same effect as setting HOLDLOCK on all tables in all SELECT
statements in a transaction."

Looks like exactly what's needed, but I'm not the expert either. However, I'd like to believe that the number of concurrent users and overall number of requests would be low enough that a well-performing update query wouldn't be holding that lock for terribly long. Get it down to sub-second response and I'd imagine the likelihood starts to drop considerably.

Steve

(aka smunson)

:-):-):-)

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

wesigler SSC Journeyman Points: 84 More actions · Answer 3

I will go with that unless I hear something else. I'll test the response time as well. Thanks.

Jeff Moden SSC Guru Points: 1004680 More actions · Answer 4

I'm still wondering about that primary key.... What happens if a company needs more numbers? Maybe it's in one of the previous posts. Haven't had time to study them, yet.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

wesigler SSC Journeyman Points: 84 More actions · Answer 5

They'll get a new rangeid so if they first went in as IBM they will next go in as IBM2.

Jeff Moden SSC Guru Points: 1004680 More actions · Answer 6

wesigler (7/6/2009)
My approach is to come up with a couple of options and then look at the pros and cons of each.

Then, my approach would be to avoid all deadlocks and other problems associated with maintaining such a table of pre-reserved blocks of ID's. The way to do this is simply to let each company do the necessary inserts with the appropriated ID for the company and let an IDENTITY column take care of the record ID's. The idea of having sequential blocks of ID's doesn't seem to have any merit here. If the goal is to limit the number of ID's based on sales, then do a count of ID's owned by the company prior to the insert of additional rows.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004680 More actions · Answer 7

smunson (7/6/2009)
Jeff Moden: I'd really appreciate your input on this...

It might not be what you expect... my input is that in this modern day of computers and other new fangled stuff 😉 like auto-numbering columns, indexes to support high speed aggregates and the like, there is no reason to preassign ranges of numbers for anything... especially for a common multi-company table. Even partitioned tables would be worked by date and CompanyID and not by record ID.

In fact, to avoid any type of clash and to give each company their own set of numbers starting at "1" or whatever number they want to start it at, it seems to me than 1 table per company each with it's own identity column would be the way to go. It would also alleviate any deadlock problems that people are currently fretting about on this problem right now.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004680 More actions · Answer 8

smunson (7/6/2009)
Here's the BOL entry on SET TRANSACTION ISOLATION LEVEL:
"SERIALIZABLE
Specifies the following:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the
current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys
read by any statements in the current transaction until the current transaction completes.
Range locks are placed in the range of key values that match the search conditions of each
statement executed in a transaction. This blocks other transactions from updating or inserting
any rows that would qualify for any of the statements executed by the current transaction.
This means that if any of the statements in a transaction are executed a second time, they will
read the same set of rows. The range locks are held until the transaction completes. This is
the most restrictive of the isolation levels because it locks entire ranges of keys and holds
the locks until the transaction completes. Because concurrency is lower, use this option only
when necessary. This option has the same effect as setting HOLDLOCK on all tables in all SELECT
statements in a transaction."
Looks like exactly what's needed, but I'm not the expert either. However, I'd like to believe that the number of concurrent users and overall number of requests would be low enough that a well-performing update query wouldn't be holding that lock for terribly long. Get it down to sub-second response and I'd imagine the likelihood starts to drop considerably.
Steve
(aka smunson)
:-):-):-)

Serializable is the default.

No matter what happens, in order to guarantee that no one "gets in" on your numbers, you must start an explicit transaction, do a select, do an insert based on the outcome of that select, and then to a commit... perfect spot for deadlocks to happen unless you also do a TABLOCKX or at least an UPDLOCK and I'm pretty sure that UPDLOCK won't do the job in this case because it allows others to read... that would be devastating in this case because they could read something they couldn't have. Pray that the frequency of inserts will be low or a lot of people could end up waiting.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

sgmunson SSC Guru Points: 110651 More actions · Answer 9

I'd agree but for the "human factor". Often, people have to deal with these numbers regularly, and a lack of "organization" of those numbers can be a genuine challenge to people who are working by the hour. They'll prefer a familiar prefix for a number involving a given client, instead of just some number. Thus the number conveys information, and you might think that with all our automation, we'd have overcome our fear of numbers, but alas, I'm not even sure the very best of education could quell what may just be "human nature". This kind of "accommodation" for human nature can often be responsible for helping to catch potential data entry errors, because someone know what the number should start with, and sees something unfamiliar. That's impossible without pre-assigned ranges or other means of the number communicating something about who it's assigned to. We don't really know if that applies in this case, so let's hear from the original poster...

Steve

(aka smunson)

:-):-):-)

Jeff Moden (7/6/2009)
smunson (7/6/2009)
Jeff Moden: I'd really appreciate your input on this...
It might not be what you expect... my input is that in this modern day of computers and other new fangled stuff 😉 like auto-numbering columns, indexes to support high speed aggregates and the like, there is no reason to preassign ranges of numbers for anything... especially for a common multi-company table. Even partitioned tables would be worked by date and CompanyID and not by record ID.
In fact, to avoid any type of clash and to give each company their own set of numbers starting at "1" or whatever number they want to start it at, it seems to me than 1 table per company each with it's own identity column would be the way to go. It would also alleviate any deadlock problems that people are currently fretting about on this problem right now.

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

wesigler SSC Journeyman Points: 84 More actions · Answer 10

Jeff, I am a little confused by your last two posts. The Alternate IDs start at 800000000 and go to 899999999. I don't see how you can get this with an identity column. I know you can start an identity column at a given number, but are you suggesting I do an insert for each alternate id? So if a company needs 50,000 Alternate IDs I do 50,0000 inserts. And then the other way is hava a table for each company so If I have 1000 companies I have 1000 tables?

Jeff Moden SSC Guru Points: 1004680 More actions · Answer 11

smunson (7/6/2009)
I'd agree but for the "human factor". Often, people have to deal with these numbers regularly, and a lack of "organization" of those numbers can be a genuine challenge to people who are working by the hour.

Heh... my recommendation would be to do the accounting on paper then to accomodate the "human factor". 😉 Obviously, I'm kind of kidding there.

Then, the only hope of pulling this off with no chance of someone getting someone else's numbers during simultaneous transactions is to use an explicit transaction with a TABLOCKX to try to prevent deadlocks and, like I said, pray for a very low number of very infrequent transactions.

Also, having a primary key that allows a company to appear only once is a bit counter intuitive. IBM is IBM... not IBM2 which is a form of denormalization by itself. There needs to be something else for a primary key regardless. Because of the range thing, the business logic behind this needs to be absolutely air-tight because you cannot make a constraint to check if the current row values fall between other row values. You could add a trigger to do that level of constraining but that would also be included in the transaction further aggrevating the possibility of a deadlock.

It's my belief that atransaction that uses a TablockX would assist the necessary business logic but, be warned, the proc that gets the new numbers must not itself be used in a transaction or deadlocks will surely occur.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004680 More actions · Answer 12

There may be even a better way to avoid deadlocks on this. Combine the Insert with the Select... I believe you'd still need the TABLOCKX because you're needing to look at the whole table for gaps (this is where the problem rears it's ugly head with stuff like this) but an explicit 2 part transaction could be avoided which should further reduce the possibility of a deadlock.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004680 More actions · Answer 13

wesigler (7/6/2009)
Jeff, I am a little confused by your last two posts. The Alternate IDs start at 800000000 and go to 899999999. I don't see how you can get this with an identity column. I know you can start an identity column at a given number, but are you suggesting I do an insert for each alternate id? So if a company needs 50,000 Alternate IDs I do 50,0000 inserts. And then the other way is hava a table for each company so If I have 1000 companies I have 1000 tables?

If a company isn't going to use each alternate id separately, why are we calculating what they should be in the first place? Do you mean to tell me that these alternate id's will NOT be used anywhere else in the database? Maybe my suggestion of doing it all on paper wasn't so far off. 😛

Also, I'm not suggesting inserting 50,000 id's at once (although that's a pretty small number of rows to begin with and it would solve all these problems)... rather I'm suggesting that 50,000 id's be automatically doled out one per row inserted (using IDENTITY) as they are used elsewhere in the database... that's provided they're used somewhere else in the database.

Shifting gears, you're correct... if you have 1,000 companies, then partitioning by individual company may be a bit of a problem.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

WayneS SSC Guru Points: 95461 More actions · Answer 14

You don't have to give "blocks" of numbers out.

If you had tables like:

CREATE TABLE Company (

CompanyID int NOT NULL identity PRIMARY KEY,

Name varchar(50) NOT NULL)

CREATE TABLE AlternateID (

CompanyID int NOT NULL,

AltIDNumber int NOT NULL,

PRIMARY KEY (CompanyID, IDNumber))

Each company could have an alternate id starting at 1.

The whole alternate id # would be the CompanyID + AltIDNumber, ie:

01050-000000001 - 01050-999999999

13852-000000001 - 13852-999999999

etc. Here each company could have 999,999,999 unique numbers to use.

Change both of the int to smallint, and you could have:

0105000001 - 0105099999

1385200001 - 1385299999

3276800001 - 3276899999

each company here would have 99,999 unique numbers to use. This might be enough, but not sure about the larger companies.

While each company's range would start at one, since the whole alternate id # includes the company identifier, there would not be any duplicates.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Jeff Moden SSC Guru Points: 1004680 More actions · Answer 15

Idea understood... but that's a sequence table and that's even worse that what's trying to be done for updates unless you're very, very, very careful and you use a 3 part update to do the dirty work. It also makes it a real bitch to do multi-row inserts. We had one at work very similar to that. It caused an average of 640 deadlocks per day with spikes to 4000 until we fixed it.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)