A Hazard of Using the SQL Update Statement

  • marlon.seton (9/19/2013)


    dwain.c (9/18/2013)


    marlon.seton (9/18/2013)


    Proiv

    http://www.proiv.com/

    We develop one code set for Windows, Sun & Linux with Oracle, then have to tweak it a bit to for Windows & SQLServer.

    Thanks! From your information it sounds like a pretty good tool.

    One can develop a system very quickly with it but you have to do some work to get good performance with relational databases.

    Good point. Presumably, one should not assume that just because a query runs well in Oracle it will also run well in SQL Server.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (9/19/2013)


    marlon.seton (9/19/2013)


    dwain.c (9/18/2013)


    marlon.seton (9/18/2013)


    Proiv

    http://www.proiv.com/

    We develop one code set for Windows, Sun & Linux with Oracle, then have to tweak it a bit to for Windows & SQLServer.

    Thanks! From your information it sounds like a pretty good tool.

    One can develop a system very quickly with it but you have to do some work to get good performance with relational databases.

    Good point. Presumably, one should not assume that just because a query runs well in Oracle it will also run well in SQL Server.

    There's that, but you also need to consider the way ProIV builds queries given the files involved. Sometimes you get much better performance writing a bespoke query and that can mean writing two, one for each database, given that things like SUBSTR[ING] are different. I'd say there would be a couple of days work involved in showing someone new to ProIV how to use it with a database.

  • Nice trick to troubshoot queries

  • I came to SQL Server from Oracle and the Update statement in SQL Server really surprised me. I don't use it for ETL because I'm joining on alternate keys and I like to know when I'm missing something in my join. Things change too, based on the way people use/configure systems and I wouldn't have even known this without that occasional failure in the program that causes me to investigate. How is the non-standard update considered a "feature"? Why would anyone want the "just pick one" approach for multiple rows returned to update the target?

  • I confessed! I have used the "DISTINCT" modified.

    Why? Because I have to crank out the code ASAP and I don't have the time to worry about the problems in your article.

    That being said, thanks for reminding me to be a better programmer.

  • Could this article be renamed to 'One of the Millions of Problems of NOT Knowing Your Data?'

  • cbowman (4/6/2015)


    Could this article be renamed to 'One of the Millions of Problems of NOT Knowing Your Data?'

    Exactly. 🙂 +1000

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I too have come to SQL Server from Oracle and although I don't have near the experience with it that most of you do, I absolutely consider this a bug and not a feature. The reason this is a bug is that it allows the result of the update to be non-determinant. Why would I want to allow the table to be changed in a way that results in unpredictable outcome?

    This is of course, my opinion.

    John

  • I enabled Change Data Capture recently and I *WANT* to apply each update in the CDC tables to "ETL" tables that mirror the original, but I ran into this issue of only being able to update one row, the first one found to be exact. I resorted to a cursor to loop though the CDC rows one at a time but I wonder if anyone has run into this same problem. I want to do this with SQL not SSIS. Thanks.

  • In response to the various references to using DISTINCT, keep in mind that DISTINCT will only save you if the values are duplicates. Once the values diverge, DISTINCT won't keep your MERGE from erroring and it won't keep your UPDATE from picking one at random. Much better to use partitioning to deterministically select one of the values (and to have a defensible criteria for selecting which value).

    Personally, I think DISTINCT (outside of ETL scenarios) is a potential code-smell. I avoid using it in the final SELECT clause of a query and strive only to use it in well-scoped sub-queries where I understand exactly why I am using it. Even then I think DISTINCT is really a shortcut and should probably be thought about as a shortcut for a more verbose form that expresses the actual intent.

    For instance, let's assume I have a M:M join table and I want to identify all the records on one side that have joins in the join table, so I do something like:

    SELECT DISTINCT Col2 From JoinTable;

    An alternative way of looking at that would be to do something like this:

    SELECT Col2 FROM Table2 WHERE EXISTS (

    SELECT * FROM JoinTable WHERE Table2.Col2 = JoinTable.Col2

    );

    The point it, what I'm really asking for is a list of all of the Col2 values (which should be the primary key of a table somewhere) that have one or more matching records in the JoinTable. In general, looking for DISTINCT values that aren't foreign keys is a hint that there may be a missing table in your schema. The fact that you are looking for DISTINCT values indicates that the values reference distinct objects, and if you haven't surfaced those objects in your schema, you're missing something.

    I'll still use DISTINCT (it's a lot shorter than the alternative syntax), but I try to limit myself to scenarios where I could replace it with the alternate syntax.

    While I'm on this kick, keep in mind that GROUP BY can be just another way of saying DISTINCT! That means that using GROUP BY at the top level of a multi-join query is a major code-smell (and can result in inefficient queries). Ditto for UNION without ALL, etc.

  • DISTINCT is often very misused / misunderstood.

    We had a query tool users had access to, and very often someone would come to me with a query that 'didn't work right'.

    Many times they had used DISTINCT, and usually I would end up showing them they were missing a join between the tables.

    At least the tool had no update capabilities.

    Testing and understanding the data is a very important part of developing code in many cases.

    Sometimes taken a bit too lightly.

    Also helps in recovery when results are not as expected. :w00t:

    As well as figuring that out soon enough to be able to fix it.

    Good article.

    Although I tend to view it more as a feature.

  • Good article! You mention using the TRY/CATCH construct to try to prevent this behavior. I would be interested in seeing an example of this. Thanks!

  • Jody-317850 (4/6/2015)


    Good article! You mention using the TRY/CATCH construct to try to prevent this behavior. I would be interested in seeing an example of this. Thanks!

    I cover the basics of TRY/CATCH here:

    Logging and Error Handling for SQL Stored Procedures[/url]

    TRY/CATCH doesn't need to be in a SP but that's most often where you'll see it. I call it "the basics" because there's a lot more you can do with it than to display some simple debugging messages.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • All I can take from this article is the mantra "If you're going to write code then make make sure you write it properly, and test it properly"

    If you're worried about incomplete joins causing multiple rows then you have not written the SELECT properly to achieve your needs

  • djblyth (4/7/2015)


    All I can take from this article is the mantra "If you're going to write code then make make sure you write it properly, and test it properly"

    If you're worried about incomplete joins causing multiple rows then you have not written the SELECT properly to achieve your needs

    I would say that this is 100% true. Now if we could all just learn to write queries properly, there would be no need to identify how easily things can fail!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 15 posts - 31 through 45 (of 56 total)

You must be logged in to reply to this topic. Login to reply