Best Practice

Question

Best Practice

Viewing 15 posts - 16 through 30 (of 58 total)

You must be logged in to reply to this topic. Login to reply

Mauve SSChampion Points: 11316 More actions · Answer 1

AJ (9/4/2008)
Good question, but I think it is being confused, including by me, with an UPSERT (as the previous posts' referral to MERGE indicates too).
This question needs more information to answer correctly: How is the action dealt with if the INSERT is not performed? An UPDATE instead? Maybe we actually want a unique constraint on the Name column to report an error if a name is attempted to be inserted twice? If we do need to update instead if the record already exists, how do we control concurrency in that case?
In high concurrency designs these and many more questions must be answered first and I completely agree a more deterministic approach would be required.
A big misconception I keep coming across is that there must be a perfect singleton solution for all situations.... but why not perfect the solution for each singleton situation? A welcome side-effect of this is that it keeps us all in jobs!

I agree. I think that the question should have stated "what is the most efficient way..."

Depending upon what is to happen if the value already exists, would also possibly influence the choice of method to be used. In this simple example with only one column, I note the following (performance and locking issues aside):

1. The first option (IF statement) has the capability of returning information back to the UI to inform the user that the value already exists.

2. Option 2 would have to check the ROWCOUNT from the insert to see if the row was inserted and then pass the appropriate information back. Otherwise, the application would [wrongly] assume that the row was added. So we're still coding an IF statement.

And I wouldn't, in this simple case, use the new SQL Server 2008 MERGE statement to update a value to itself.

[font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

Tom Garth SSCertifiable Points: 6193 More actions · Answer 2

A GREAT QoD! I got it wrong, then read how obvious the answer was. It's amazing the things you don't take time to think about. This will change my practice.

I've done SELECT 1 for years out of habit. I never saw SELECT NULL used. That's pretty cool. I might switch.

Thanks for the tidbit Sergiy.

Tom Garth
Vertical Solutions[/url]

"There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves." -- Will Rogers

Mauve SSChampion Points: 11316 More actions · Answer 3

Tom Garth (9/4/2008)
... I never saw SELECT NULL used. That's pretty cool. I might switch.

I discovered the SELECT NULL syntax by observing the internal SQL statements issued by Oracle when it is checking FK constraints. So if it was good syntax for the engine, then it must be good for us programmers.

Side note: Both DB2 and Oracle, both architected from the same original System R specification, use SQL to run the engine. This was a good thing, in that it forced optimization as well as enhancements to the SQL language.

One of the first things that DB2 encountered was heavy lock contention and deadlocking on its own internal dictionary tables as it was taking page-level locks. The next release (circa 1984) had row-level locking.

[font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

Ric Sierra SSCrazy Points: 2260 More actions · Answer 4

I'm not completly agree with the answer, because depends of the context:

If you are looking for the best execution plan the answer is #1

If you are looking for less deadlock the answer is #2

But if the target is INSERT a new record, both solutions works.

Alan. T. Hall of Fame Points: 3221 More actions · Answer 5

Great question! I have seen many co-workers create queries like this that work really well when all of the data can be copied onto memory on their test but when it get to the big production server it brings the system to a halt locking huge numbers of records and if let run long enough causing a huge rollback.

Tom Garth SSCertifiable Points: 6193 More actions · Answer 6

Ric Sierra (9/4/2008)
I'm not completly agree with the answer, because depends of the context:
If you are looking for the best execution plan the answer is #1
If you are looking for less deadlock the answer is #2
But if the target is INSERT a new record, both solutions works.

In my mind, it's the unknown adventure between the locks that should be avoided.

Tom Garth
Vertical Solutions[/url]

"There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves." -- Will Rogers

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 7

Could just be me, but answer 2 just seemed to be the logically correct answer to the question.

😎

TimothyAWiseman SSCrazy Eights Points: 8819 More actions · Answer 8

This was one of the best questions with the best explanation I have seen out there.

IT does seem slightly dated now that 2008 is out though. In 2008 it seems that Merge would be the best option.

---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 9

Not sure if MERGE is needed here. The question only indicted that a record needed to be inserted if it did not already exist in the database. Nothing in the question indicated that an update should be done if the record already existed. With no update requirement, why use MERGE?

😎

Christian Buettner-167247 SSChampion Points: 13729 More actions · Answer 10

Hugo Kornelis (9/4/2008)
Christian Buettner (9/4/2008)
I have adjusted my post above to remove the wrong information.
.... that I seem to have overlooked thus far. And now I find myself wondering what it is that you have written.
Sad, isn't it? 😀

Well, if you look close enough, you might still find my quote on the page :o)

Best Regards,

Chris Büttner

dmoldovan SSCertifiable Points: 6334 More actions · Answer 11

I agree with Christian & Ric Sierra - great question, but the option to choose depends on other circumstances (i.e. indexes, isolation level, how much performance you need.....)

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 12

Lynn Pettis (9/4/2008)
Not sure if MERGE is needed here. The question only indicted that a record needed to be inserted if it did not already exist in the database. Nothing in the question indicated that an update should be done if the record already existed. With no update requirement, why use MERGE?
😎

Hi Lynn,

Needed? No, not needed. The INSERT WHERE NOT EXISTS works great as well.

But it's definitely possible to do it with MERGE as well. Though MERGE is a great way to do the "update-or-insert" routine, it has many other possible uses as well. For instance, you can use it to replace the non-standard and dangerous UPDATE FROM syntax (see my blog for details). And you can also use it to do a conditional insert:

MERGE INTO Customer

USING (SELECT @CustomerName) AS New(Name)

ON New.Name = Customer.Name

WHEN NOT MATCHED

THEN INSERT (Name) VALUES (New.Name);

I was unable to detect any performance difference with the INSERT WHERE NOT EXISTS version, so I guess it eventually boils down to personal preference.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Ric Sierra SSCrazy Points: 2260 More actions · Answer 13

MERGE is an alternative to reach the target -Insert a new row without duplicates-, but is the last alternative I would choose for this case.

You spend a lot of code and avoid the simplicity. Hard to read for.

Plus: Using MERGE is a commitment to be MSSQL 2008 dependant. Isn't it?

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 14

Also, MERGE isn't available in SQL Server 2005, and our LOB systems will not migrate to SQL Server 2008 until the vendors support SQL Server 2008. We just now (in the last year) were able to migrate to SQL Server 2005. I don't see a move to SQL Server 2008 for at least 2 or 3 years based on past experience.

😎

Sergiy SSC Guru Points: 110208 More actions · Answer 15

Ric Sierra (9/4/2008)
I'm not completly agree with the answer, because depends of the context:
If you are looking for the best execution plan the answer is #1

Can you prove it?

Open QA, run the query from the answer and see that execution plans are identical.

Just 1st option include 2 statements in execution plan, and 2nd one is all in one.

Estimation cost is also 50% for both.

If you are looking for less deadlock the answer is #2

I don't see any difference in regards to deadlocks.

Can you prove your point?

But if the target is INSERT a new record, both solutions works.

You did not get the explanation for the answer.

The point of the question was to illustrate that #1 does not always work.

Using #1 you embed into your code a chance for unexpected application crashes.

_____________
Code for TallyGenerator