December 4, 2012 at 4:27 am
I am quite new to this, I'll try to explain my task and how I thought to solve it.
I have a Python process that needs to access a table, which key is an integer. When I want to
insert a new record, first I find the highest key with a
SELECT max(id) FROM TASKS
After that, given I captured that result in a variable id, I execute a
INSERT INTO TASKS
VALUES (id+1, ..., ..., ...).
It is working fine. What troubles me is that there will be different instances of this process running,
and I thought it is possible that two of them launch the first query, get the same max id, and then they will try to perform two insertions with the same primary key, which will obviously fail.
I thought the problem would be solved if, just before starting, the process would block the table
TASKS to everyone else, not allowing them to read. And the it would be restored after the insertion has been performed. How can I do this?
Or should I follow an entirely different approach?
Thanks in advance,
December 4, 2012 at 6:41 am
The appropriate locking should take place by default - you can provide hints if you wish to lock objects at different levels (row, page, table), and you can force queries to read uncommitted changes etc.
In your case, you are correct that another process may grab the same ID, so I would encapsulate the finding of the max(id) FROM TASKS and the INSERT statement into a transaction.
You may want to think about having your key with the auto-increment property set to true? then you won't even have to worry about it.
Also, you may want to consider adding a TRY ... CATCH to your proc in case the ID is already taken, your proc could try with a new incremented ID.
B
December 4, 2012 at 6:58 am
Try doing it in a single statement
Insert into Tasks ...
SELECT max(Id) + 1, ... From Tasks
That should work.
Otherwise you will need to force a different lock on the select. It's not a read lock that you want to force, SQL already takes one of those by default. You need a more restrictive lock to prevent multiple sessions running the max at the same time.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 4, 2012 at 8:20 am
Thank you both for your quick replies.
bleroy (12/4/2012)
In your case, you are correct that another process may grab the same ID, so I would encapsulate the finding of the max(id) FROM TASKS and the INSERT statement into a transaction.B
Maybe this question is dumb. If I do this, having both queries are encapsulated in a transaction, and two different processes launch this transaction, the problem cannot happen? Is it sure that one of the transactions is executed completely before the other can start?
I also thought of what GilaMonster suggested, (controling the case when the table is empty, and the result of max is null). But I decided to ask for a more general solution.
December 4, 2012 at 8:31 am
zeycus (12/4/2012)
Maybe this question is dumb. If I do this, having both queries are encapsulated in a transaction, and two different processes launch this transaction, the problem cannot happen? Is it sure that one of the transactions is executed completely before the other can start?
No, not at all. If you just encapsulate that in a transaction it will make absolutely no difference whatsoever to the behaviour (assuming default isolation level). Two connections will be able to start the transaction, compute the same max, try to insert and one will fail.
You need to either do it in one operation as I suggested or you need more restrictive locks (shared locks are just that. Shared)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 4, 2012 at 8:35 am
GilaMonster (12/4/2012)
zeycus (12/4/2012)
Maybe this question is dumb. If I do this, having both queries are encapsulated in a transaction, and two different processes launch this transaction, the problem cannot happen? Is it sure that one of the transactions is executed completely before the other can start?No, not at all. If you just encapsulate that in a transaction it will make absolutely no difference whatsoever to the behaviour (assuming default isolation level). Two connections will be able to start the transaction, compute the same max, try to insert and one will fail.
You need to either do it in one operation as I suggested or you need more restrictive locks (shared locks are just that. Shared)
I understand, thanks a lot. I will follow your suggestion to encapsulate it in a single query. If in the future I have the same problem with a more complex task, I will deal with it then 🙂
December 4, 2012 at 8:36 am
:blush: .. thanks for the correction GilaMonster
December 4, 2012 at 2:03 pm
zeycus (12/4/2012)
...first I find the highest key with aSELECT max(id) FROM TASKS
After that, given I captured that result in a variable id, I execute a
INSERT INTO TASKS
VALUES (id+1, ..., ..., ...).
...
Is there a reason you're not using IDENTITY for that id column? If you create the table like this:
CREATE TABLE TASKS
(id INT IDENTITY(1,1) NOT NULL,
...
then SQL Server will handle ensuring id will be different if 2 sessions try to insert a new record at the same time, instead of trying to re-invent the wheel yourself.
December 5, 2012 at 2:34 am
Chris Harshman (12/4/2012)
zeycus (12/4/2012)
...first I find the highest key with aSELECT max(id) FROM TASKS
After that, given I captured that result in a variable id, I execute a
INSERT INTO TASKS
VALUES (id+1, ..., ..., ...).
...
Is there a reason you're not using IDENTITY for that id column? If you create the table like this:
CREATE TABLE TASKS(id INT IDENTITY(1,1) NOT NULL,
...
then SQL Server will handle ensuring id will be different if 2 sessions try to insert a new record at the same time, instead of trying to re-invent the wheel yourself.
Thank you, of course you are right. Someone suggested that yesterday and I changed it. But anyway I wanted to know how to do this in a general case.
December 5, 2012 at 5:07 am
Chris Harshman (12/4/2012)
Is there a reason you're not using IDENTITY for that id column?
then there is no guarantee that destination table will have same ID for same records as it has in source table (remember...sequence of insert query will decide the id in case of identity colum in destination table)
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
December 5, 2012 at 6:20 am
Bhuvnesh (12/5/2012)
then there is no guarantee that destination table will have same ID for same records as it has in source table (remember...sequence of insert query will decide the id in case of identity colum in destination table)
Since his destination and source tables are the same and the inserts are single row, neither is a concern here.
Identity would be the easiest approach, if we've seen the entire picture.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 11 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply