|
|
|
SSCarpal Tunnel
       
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 5:01 PM
Points: 4,540,
Visits: 8,184
|
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Wednesday, April 24, 2013 5:02 AM
Points: 2,365,
Visits: 1,825
|
|
oops... Nice question sergiy. I was surprised with the number of incorrect answers myself included.:)
"Keep Trying"
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Today @ 1:06 AM
Points: 2,525,
Visits: 3,618
|
|
Nice question, but not "totally" deterministic in my opinion. What If I ensure consistency through the transaction isolation level already?
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ BEGIN TRANSACTION IF NOT EXISTS... INSERT... COMMIT Edit: Deleted wrong information from me regarding Performance without indexes
Best Regards,
Chris Büttner
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 9:48 AM
Points: 5,235,
Visits: 7,037
|
|
Christian Buettner (9/4/2008)
Nice question, but not "totally" deterministic in my opinion. What If I ensure consistency through the transaction isolation level already? SET TRANSACTION ISOLATION LEVEL REPEATABLE READ BEGIN TRANSACTION IF NOT EXISTS... INSERT... COMMIT In this case, both options seem to be equal(from a performance perspective) if there is an index on Name. If no index is defined, the first one performs better in my tests. Input on this is highly welcome.
Hi Chris,
Setting a higher transaction isolation level will indeed prevent problems from a different connection inserting the row in just the wrong moment. (I'm not sure at the moment if REPEATABLE READ suffices or if you need SERIALIZABLE though - maybe after some coffee).
I still consider the other option the better one, because it allows the optimizer to come up with a more efficient plan. After all, the check for existence can only be done by finding the clustered index page where the row should be - and that same page has to be found for actually inserting it, so there's no need to traverse the B-tree twice. I don't know if it really works this way in any current version of SQL Server, but even if it doesn't, a future version mght implement such an optimization. And that alone is sufficent reason for me to prefer the second option.
Great question, Sergiy!
Best, Hugo
PS: Of course, with SQL Server 2008, neither of these options is the "best" anymore; this is a typical scenario for the new MERGE statement. :)
Hugo Kornelis, SQL Server MVP Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Wednesday, September 10, 2008 7:34 AM
Points: 38,
Visits: 41
|
|
Good question, but I think it is being confused, including by me, with an UPSERT (as the previous posts' referral to MERGE indicates too).
This question needs more information to answer correctly: How is the action dealt with if the INSERT is not performed? An UPDATE instead? Maybe we actually want a unique constraint on the Name column to report an error if a name is attempted to be inserted twice? If we do need to update instead if the record already exists, how do we control concurrency in that case?
In high concurrency designs these and many more questions must be answered first and I completely agree a more deterministic approach would be required.
A big misconception I keep coming across is that there must be a perfect singleton solution for all situations.... but why not perfect the solution for each singleton situation? A welcome side-effect of this is that it keeps us all in jobs!
|
|
|
|
|
SSCommitted
      
Group: General Forum Members
Last Login: Thursday, March 31, 2011 1:40 AM
Points: 1,938,
Visits: 208
|
|
Chris,
Doesn't the second solution have to do the same B-tree traversal?
Good question, today, and good discussions!!
Joe
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Friday, February 15, 2013 2:53 PM
Points: 1,199,
Visits: 568
|
|
i thought this was like which one will perform better
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 9:48 AM
Points: 5,235,
Visits: 7,037
|
|
dunnjoe (9/4/2008) Chris,
Doesn't the second solution have to do the same B-tree traversal?
Good question, today, and good discussions!!
Joe
Hi Joe,
I assume you meant me, not Chris. :)
The second solution has to do the B-tree traversal at least once, to find out iff the row already exists. If not, it doesn't have to (*) repeat the traversal to perform the actual insertion.
The first solution on the other hand has two statements, that are executed after each other. So the B-tree has to be traversed for the IF, and if the NOT EXISTS is found to be true, the same B-tree traversal will then have to be repeated for the actual INSERT. Since it's two statements, there's no way the optimizer can improve on this.
(*) Disclaimer - I don't know what optimizations are actually in place. This describs theoretically possible behavior; I have not done any tests to assess what the actual behavior is like.
Hugo Kornelis, SQL Server MVP Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Tuesday, January 12, 2010 7:37 AM
Points: 15,
Visits: 143
|
|
| side note: we shouldn't really be doing "select *" to check for existence. If that's a LOONNNNGGGGG row there could be a lot of overhead to no good purpose. It may be more self-documenting to code "select 'exists' from ....."
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Today @ 1:06 AM
Points: 2,525,
Visits: 3,618
|
|
dunnjoe (9/4/2008) Chris,
Doesn't the second solution have to do the same B-tree traversal?
Good question, today, and good discussions!!
Joe Hi Joe,
yes it does. I assume I have not cleared the table between my tests. Therefore, the first option was obviously more cost efficient in the actual plan, since the insert was not applied due to the customer being already in the table.
I have adjusted my post above to remove the wrong information.
Best Regards,
Chris Büttner
|
|
|
|