Writing the Correct Query is Important

  • Has this turned into SQLServerConfessional.com?

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Ed Wagner - Monday, January 16, 2017 9:41 AM

    Eric M Russell - Monday, January 16, 2017 8:34 AM

    SQL suffers from the same predicament as HTML and JavaScript; it's an accessible high-level language that's widely used by a lot of people, but only a few people actually know how to code it at a professional level. When I'm called in to help resolve a performance issue, it's typical for IT management and application developers to have this prejudicial assumption that the root cause has something to do with the configuration of the database server or network itself. The developers bristle when I suggest that there is something wrong with how the SQL itself is coded. After all, they rate themselves an 8 out of 10 when it comes to proficiency at SQL. At least initially, they dismiss my advice because they think I'm wasting time by nitpicking their coding style, but when they see how much of a performance improvement can be achieved simply by modifying a join or removing a user-defined function from the WHERE clause, they realize there is so much more to be known.

    Eric, you said a mouthful.  There are a lot of coders, but not a lot of people who understand how things work.  I'm not a consultant, but I am asked to resolve performance problems internally.  Too many times I hear "you're right that the structure needs work, but we have a lot of things built around it so we can't change it" as the reason for not making real change.  The code is most often the source of the problem (the scalar function in the WHERE clause, like you said, or an implicit cast on a join), but the underlying design is sometimes the limit on making huge improvements.

    I also encounter the "hey, it works" syndrome, where people get defensive about their code and don't believe it can be made any faster.  When it is made 30x faster, they think there's a trick somewhere.  Naturally, it couldn't be their cursor.  Sadly, I know I'll encounter the same code and the same attitude again soon.

    And yet when a DBA suggests that a senior developer ought to use the .NET StringBuilder or pre-size an array rather than growing it 10K+ times, somehow they still know best and "coding is a lot more difficult than select * ...".

    Oh well, got to get my amusement somewhere 🙂
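
    A minimal sketch of the scalar-function-in-the-WHERE-clause problem described above; the table, column, and function names here are hypothetical, not taken from the thread:

    -- Wrapping the indexed column in a scalar UDF forces the function to run per row
    -- and generally prevents an index seek on OrderDate.
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE dbo.fn_DateOnly(OrderDate) = '20170116';

    -- The same filter written as a sargable range predicate can seek on an
    -- OrderDate index and never calls the function at all.
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE OrderDate >= '20170116'
      AND OrderDate <  '20170117';

    The implicit cast on a join that Ed mentions causes the same kind of damage: converting one side's column to match the other's type on every row usually rules out an index seek on that column.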

  • Yet Another DBA - Wednesday, January 18, 2017 12:34 AM

    Ed Wagner - Monday, January 16, 2017 9:41 AM

    Eric M Russell - Monday, January 16, 2017 8:34 AM

    SQL suffers from the same predicament as HTML and JavaScript; it's an accessible high-level language that's widely used by a lot of people, but only a few people actually know how to code it at a professional level. When I'm called in to help resolve a performance issue, it's typical for IT management and application developers to have this prejudicial assumption that the root cause has something to do with the configuration of the database server or network itself. The developers bristle when I suggest that there is something wrong with how the SQL itself is coded. After all, they rate themselves an 8 out of 10 when it comes to proficiency at SQL. At least initially, they dismiss my advice because they think I'm wasting time by nitpicking their coding style, but when they see how much of a performance improvement can be achieved simply by modifying a join or removing a user-defined function from the WHERE clause, they realize there is so much more to be known.

    Eric, you said a mouthful.  There are a lot of coders, but not a lot of people who understand how things work.  I'm not a consultant, but I am asked to resolve performance problems internally.  Too many times I hear "you're right that the structure needs work, but we have a lot of things built around it so we can't change it" as the reason for not making real change.  The code is most often the source of the problem (the scalar function in the WHERE clause, like you said, or an implicit cast on a join), but the underlying design is sometimes the limit on making huge improvements.

    I also encounter the "hey, it works" syndrome, where people get defensive about their code and don't believe it can be made any faster.  When it is made 30x faster, they think there's a trick somewhere.  Naturally, it couldn't be their cursor.  Sadly, I know I'll encounter the same code and the same attitude again soon.

    And yet when a DBA suggests that a senior developer ought to use the .NET StringBuilder or pre-size an array rather than growing it 10K+ times, somehow they still know best and "coding is a lot more difficult than select * ...".

    Oh well, got to get my amusement somewhere 🙂

    Sometimes "Senior [Insert Job Role Here]" = "Ambitious Muppet with Delusions of Adequacy"

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • I once worked with an arrogant developer who nulled out a column on thousands of rows with a badly written update statement in Oracle. His response was to leave the data as it was, saying it would have worked correctly in SQL Server. A bit like this:

  • paul s-306273 - Wednesday, January 18, 2017 4:20 AM

    I once worked with an arrogant developer who nulled out a column on thousands of rows with a badly written update statement in Oracle. His response was to leave the data as it was, saying it would have worked correctly in SQL Server. A bit like this:

    For anyone wanting to follow Paul's link concatenate the following two lines into one and paste in a browser address bar:

    https://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::::
    P11_QUESTION_ID:273215737113

    This is the forum changing a colon and capital P to an emoticon!!!

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Yet Another DBA - Wednesday, January 18, 2017 12:34 AM

    Ed Wagner - Monday, January 16, 2017 9:41 AM

    Eric M Russell - Monday, January 16, 2017 8:34 AM

    And yet when a DBA suggests that a senior developer ought to use the .NET StringBuilder or pre-size an array rather than growing it 10K+ times, somehow they still know best and "coding is a lot more difficult than select * ...".

    Oh well, got to get my amusement somewhere 🙂

    I won't touch .NET.

  • I once inadvertently coded an update script that looked something like this:

    SET @User = 1234;

    UPDATE VeryBigTable
    SET Status = 0
    WHERE User = User;
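    -- Note the missing @ on @User: the predicate compares the User column to itself,
    -- which is true for every row with a non-NULL User, so the whole table gets updated.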

    From that point forward, I add a block of code that basically performs an assertion and then does a rollback and raises an error if the actual number of rows affected exceeds an expected threshold.

    BEGIN TRAN;
       UPDATE ... ;
    IF @@ROWCOUNT > 10000
    BEGIN
       ROLLBACK TRAN;
       RAISERROR ... ;
    END
    ELSE
       COMMIT TRAN;

    One nice-to-have T-SQL extension would be a way of declaring a rollback assertion within the DML statement itself, perhaps as a hint. If it could also leverage statistics to simply short-circuit the DML operation without even starting it, then that would be even better.

    For example:

    UPDATE ... OPTION ( ASSERT MAX_ROWS_AFFECTED(10000) );

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Wednesday, January 18, 2017 7:46 AM

    I once inadvertently coded an update script that looked something like this:

    SET @User = 1234;

    UPDATE VeryBigTable
    SET Status = 0
    WHERE User = User;

    From that point forward, I add a block of code that basically performs an assertion and then does a rollback and raises an error if the actual number of rows affected exceeds an expected threshold.

    BEGIN TRAN;
       UPDATE ... ;
    IF @@ROWCOUNT > 10000
    BEGIN
       ROLLBACK TRAN;
       RAISERROR ... ;
    END
    ELSE
       COMMIT TRAN;

    One nice-to-have T-SQL extension would be a way of declaring a rollback assertion within the DML statement itself, perhaps as a hint. If it could also leverage statistics to simply short-circuit the DML operation without even starting it, then that would be even better.

    For example:

    UPDATE ... OPTION ( ASSERT MAX_ROWS_AFFECTED(10000) );

    Instead of littering your SQL with "maybe I am crazy" code, a better approach might be some sort of testing method, or finding out why you wrote that code to begin with (too tired?  stress?  working too fast?  ready to retire?).  Sometimes for testing I will wrap a bunch of scary code in a transaction and then call rollback at the end; before I call rollback, I run some queries to see if my code worked.

  • billm 56127 - Wednesday, January 18, 2017 8:19 AM

    Eric M Russell - Wednesday, January 18, 2017 7:46 AM

    I once inadvertently coded an update script that looked something like this:

    SET @User = 1234;

    UPDATE VeryBigTable
    SET Status = 0
    WHERE User = User;

    From that point forward, I add a block of code that basically performs an assertion and then does a rollback and raises an error if the actual number of rows affected exceeds an expected threshold.

    BEGIN TRAN;
       UPDATE ... ;
    IF @@ROWCOUNT > 10000
    BEGIN
       ROLLBACK TRAN;
       RAISERROR ... ;
    END
    ELSE
       COMMIT TRAN;

    One nice-to-have T-SQL extension would be a way of declaring a rollback assertion within the DML statement itself, perhaps as a hint. If it could also leverage statistics to simply short-circuit the DML operation without even starting it, then that would be even better.

    For example:

    UPDATE ... OPTION ( ASSERT MAX_ROWS_AFFECTED(10000) );

    Instead of littering your SQL with "maybe I am crazy" code, a better approach might be some sort of testing method, or finding out why you wrote that code to begin with (too tired?  stress?  working too fast?  ready to retire?).  Sometimes for testing I will wrap a bunch of scary code in a transaction and then call rollback at the end; before I call rollback, I run some queries to see if my code worked.

    An assertion isn't just for coding mistakes or one-off DML scripts. It can also be used in stored procedures to cover something like invalid parameters, or an unexpected behavior resulting from certain parameter combinations that would affect an unusually large number of rows. Just like a foreign key or referential integrity constraint, a row count assertion is also a form of declarative documentation, saying "FYI: no more than X rows should be updated by this operation."

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • billm 56127 - Wednesday, January 18, 2017 8:19 AM

    Eric M Russell - Wednesday, January 18, 2017 7:46 AM

    I once inadvertently coded an update script that looked something like this:

    SET @User = 1234;

    UPDATE VeryBigTable
    SET Status = 0
    WHERE User = User;

    From that point forward, I add a block of code that basically performs an assertion and then does a rollback and raises an error if the actual number of rows affected exceeds an expected threshold.

    BEGIN TRAN;
       UPDATE ... ;
    IF @@ROWCOUNT > 10000
    BEGIN
       ROLLBACK TRAN;
       RAISERROR ... ;
    END
    ELSE
       COMMIT TRAN;

    One nice-to-have T-SQL extension would be a way of declaring a rollback assertion within the DML statement itself, perhaps as a hint. If it could also leverage statistics to simply short-circuit the DML operation without even starting it, then that would be even better.

    For example:

    UPDATE ... OPTION ( ASSERT MAX_ROWS_AFFECTED(10000) );

    Instead of littering your SQL with "maybe I am crazy" code, a better approach might be some sort of testing method, or finding out why you wrote that code to begin with (too tired?  stress?  working too fast?  ready to retire?).  Sometimes for testing I will wrap a bunch of scary code in a transaction and then call rollback at the end; before I call rollback, I run some queries to see if my code worked.

    The first time I run a query that updates or deletes, I'll include an OUTPUT clause so I can look at what I'm touching, and include a ROLLBACK statement.  If the data from the OUTPUT clause looks right, I'll change the ROLLBACK to a COMMIT and re-execute.
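
    A minimal sketch of the OUTPUT-then-ROLLBACK check Marcia describes; the table, columns, and predicate are made up for illustration:

    BEGIN TRAN;

    UPDATE dbo.Orders
    SET    Status = 0
    OUTPUT deleted.OrderID,
           deleted.Status  AS OldStatus,
           inserted.Status AS NewStatus
    WHERE  OrderDate < '20170101';

    -- Inspect the OUTPUT result set: if it looks wrong, keep the ROLLBACK below;
    -- if it looks right, change it to COMMIT TRAN and re-execute.
    ROLLBACK TRAN;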

  • Marcia J - Friday, March 10, 2017 8:44 AM

    The first time I run a query that updates or deletes, I'll include an OUTPUT clause so I can look at what I'm touching, and include a ROLLBACK statement.  If the data from the OUTPUT clause looks right, I'll change the ROLLBACK to a COMMIT and re-execute.

    Here is how I sometimes unit test a stored procedure. I only do this in development, of course.

    begin tran;
    exec mynewproc ...
    select ... ;  -- confirm it's updating rows, returning the result set, etc. correctly
    rollback tran;

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • billm 56127 - Monday, January 16, 2017 8:42 AM

    Sometimes a developer makes a mistake because they're not really a good developer.  For example, how many days are there from May 10th to May 15th?  There are 6 days, not 15 - 10 = 5.  Another example is not putting parens around an OR condition.

    Or maybe there are only 4 days: does "from ... to ..." mean "between ... and ... inclusive", "between ... and ... exclusive", "on or after ... and before ...", or "after ... and on or before ..."?  The developer may be being screwed by a sloppy definition of the requirement, like the one you have suggested, which has 2 interpretations out of 4 that agree with the answer you claim it would be wrong for the developer to reach, and a third that differs even more from what you intended.

    I've spent a lot of time trying to get people to make their requirements specific, precise and unambiguous instead of blaming a developer for interpreting their ambiguous and imprecise non-specifications in ways they didn't intend.  In my experience that happens more often than developers getting it wrong, and many of the people producing these specifications fail to understand (even when it's pointed out to them by very senior management) that it's their job to be unambiguous.
    And was that "OR condition" part of a complex sentence that indicated clearly where the various ORs and ANDs and NOTs were bounded, or was it another failure to specify unambiguously?

    Tom
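
    A quick illustration of both points above, since the off-by-one and the OR precedence trip people up (the column names in the comments are hypothetical):

    -- DATEDIFF counts date boundaries crossed, i.e. the end-exclusive answer.
    SELECT DATEDIFF(DAY, '20170510', '20170515');        -- 5
    SELECT DATEDIFF(DAY, '20170510', '20170515') + 1;    -- 6 when both endpoints are inclusive

    -- AND binds tighter than OR, so without parens these two predicates differ:
    --   WHERE Status = 'Open' OR Status = 'Pending' AND Region = 'EU'
    --     is evaluated as Status = 'Open' OR (Status = 'Pending' AND Region = 'EU');
    --   WHERE (Status = 'Open' OR Status = 'Pending') AND Region = 'EU'
    --     is usually what was intended.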

  • billm 56127 - Wednesday, January 18, 2017 8:19 AM

    Eric M Russell - Wednesday, January 18, 2017 7:46 AM

    I once inadvertently coded an update script that looked something like this:

    SET @User = 1234;

    UPDATE VeryBigTable
    SET Status = 0
    WHERE User = User;

    From that point forward, I add a block of code that basically performs an assertion and then does a rollback and raises an error if the actual number of rows affected exceeds an expected threshold.

    BEGIN TRAN;
       UPDATE ... ;
    IF @@ROWCOUNT > 10000
    BEGIN
       ROLLBACK TRAN;
       RAISERROR ... ;
    END
    ELSE
       COMMIT TRAN;

    One nice-to-have T-SQL extension would be a way of declaring a rollback assertion within the DML statement itself, perhaps as a hint. If it could also leverage statistics to simply short-circuit the DML operation without even starting it, then that would be even better.

    For example:

    UPDATE ... OPTION ( ASSERT MAX_ROWS_AFFECTED(10000) );

    Instead of littering your SQL with "maybe I am crazy" code, a better approach might be some sort of testing method, or finding out why you wrote that code to begin with (too tired?  stress?  working too fast?  ready to retire?).  Sometimes for testing I will wrap a bunch of scary code in a transaction and then call rollback at the end; before I call rollback, I run some queries to see if my code worked.

    I agree with Eric.   Dinning the idea that error management should have error detection and error containment as its first priorities into the heads of dumb/green developers and dumb/green DBAs and dumb/green sysadmins has been a pain because it's difficult (they all think they never make mistakes until they learn otherwise), but it has saved me a lot of trouble by getting them to protect the system as best they can from their own errors.  I was glad to see try...catch added to SQL, but I would like to see "assert" (which throws an error if the condition asserted is false, so it's pretty much the standard way of doing what Eric describes once one has try...catch) added as well, preferably as a condition built into the statement itself as Eric suggests (but I'd prefer "ROWS_AFFECTED() < 10001" to Eric's version, as I want to be able to make more complex assertions using logic operators to combine conditions).

    Tom
