Stairway to T-SQL: Beyond The Basics Level 8: Coding Shortcuts using += and -= Operators

  • Jeff Moden (7/15/2016)


    asutorius (7/15/2016)


    I was curious how these two concatenate examples compared in speed, I placed the examples in their own tabs in my local Sql Server 2014 and turned on Include Client Side Statistics for each example, then ran them.

    The non-shorthand example beat the shorthand in Client Execution Time, Bytes received from server, Total Execution time, and Wait time on server replies.

    From that I'm inferring that just because you type less characters, doesn't mean it's the best solution.

    Now I'm wondering if I'm seeing the stats correctly, do y'all get those results too?

    Can you post your test harness so that we can duplicate your tests?

    I ran my own test. It's not all encompassing, to be sure. But, after several dozen runs, they each take turns winning by the same amount. Here's the code.

    SET STATISTICS TIME ON;

    DECLARE @Variable INT;

    SELECT @Variable = 0;

    SELECT TOP 10000000

    @Variable = @Variable + 1

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    ;

    SET STATISTICS TIME OFF;

    GO

    PRINT REPLICATE('-',80)

    GO

    SET STATISTICS TIME ON;

    DECLARE @Variable INT;

    SELECT @Variable = 0;

    SELECT TOP 10000000

    @Variable += 1

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    ;

    SET STATISTICS TIME OFF;

    My test was run on 2008 (non-R2) Enterprise Edition on a quad core i5 laptop.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Adding to Joe's statements, the portability of standard-SQL skill set of developers plays an important role in determining a major portion of project costs. Also, not forgetting the need to re-write code due to vendor-special syntax being deprecated or vendor-internal features being changed on new releases.

    See ISO/IEC 9075-4 Persistent Stored Modules (SQL/PSM) that addresses the makeup of stored procedures/modules by specifying the syntax and semantics of SQL-statements for facilities like directing flow of control, assignment of result of expressions to variables and parameters, specification of condition handlers, cursors, local variables, including control statements etc.

    A summary of description of ISO SQL/PSM, along with other companion 14+ parts of the ISO SQL standard is described in "ISO 9075-1 Database languages -- SQL -- Part 1: Framework (SQL/Framework)", which is freely available. Companion ISO SQL Technical Reports parts ISO 19075 (parts 1 thru 4) can be useful as described in the freely available standards list.

    You can look up the ISO list of freely available ISO standards at:

    http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html

    Not sure I buy the metaphor of scientific calculator vs. basic calculator---does not apply.

  • babap (7/18/2016)


    Not sure I buy the metaphor of scientific calculator vs. basic calculator---does not apply.

    In most cases, neither does portability. 😉 Not everyone needs applications that will work on any and all RDBMSs. I think it's seriously overstated.

    We have a 3rd party application in house where the database design, the backend code, and the front end code are all a resource intensive, slow performing piece of junk. Yeah, it does work. But when I talked with the lead developer about why he didn't use clustered indexes and why he didn't use CTEs and why they used cursors for a lot of things, he said it was so that they could easily migrate the product to Oracle. When I asked him when they were going to do that, he said they plan to start in 2 or 3 years.

    Portability is a myth. Learn how to properly abstract your code and use the best features of whatever RDBMS thr application is currently working with.

    And, it's way too easy to brush things off by simply saying "does not apply". 😉 You don't have to buy the metaphor. Buy the fact that as soon as you use variables in T-SQL, your code is no longer portable. I believe that most decent scientific calculators allow for the storage of more than one variable nowadays. How's that 4 function calculator doing? 😉

    To continue the calculator metaphor, there are two types of scientific calculators. "Regular" ones that follow common mathematical hierarchy and those like HP made that use "Reverse Polish Notation" (not a slam on people from Poland... that's the real name of the notation method). Both types of calculators are incredible in what they can do. But, the folks using the "Regular" ones have no clue how to make proper entries on the RPN calculators making them virtually useless to them until they learn how to use the features. Does that mean that the folks using the RPN calculators should switch to "Regular" scientific calculators? No way. Those folks can do a much better job than being forced to change.

    Same goes with the various extensions to SQL in various RDBMSs.

    Use the full functionality of the tools you have. They both will return the correct answers much more quickly and effectively for the same questions. Learn to abstract your code instead of relying on the myth of portability.

    Also remember that there are two types of SQL; that which must return values to the front end and that which must do the really heavy lifting for batch jobs and data warehousing. If you relegate the heavy lifting to ANSI/ISO-only code, then you're like the RPN calculator user being forced to do cube roots on a 4 function calculator. You can do it but it's going to take some extra knowledge and a lot of extra time.

    Log and Trig tables, anyone? 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I think this is a false analogy. Numbers are numbers and your arguing the interface (infix versus RPN) somehow makes them different. I also think that cursor argument your client gave is incredibly stupid. Oracle actually has better support for CTEs and more advanced features than SQL Server.

    I also see a lot more kinds of SQL than just to. In memory, columnar, data warehouse, big data (Terra data that that sort of thing), streaming data, etc. the genius of SQL is that it is detached from hardware. You do not have learn a new language. When you move from one product to the other, if stick to portable code. Remember the "Psychology of Computer Programming" by Weinberg? He showed that if you start off with a particular goal, the code will meet that goal. The problem is it very often meets that goal the expense of everything else.

    Oh, I do use lookup tables 🙂 there is no way in hell I am going to write a student T-distribution in SQL. I would not even try to code a Chebyshev polynomial in SQL; I just do not trust SQL engine to have floating-point corrections built in it. I also have a copy of "Interpolation and Approximation" by Philip J. Davis (http://www.doverpublications.com) just in case I need to do some fancy math.

    Books in Celko Series for Morgan-Kaufmann Publishing
    Analytics and OLAP in SQL
    Data and Databases: Concepts in Practice
    Data, Measurements and Standards in SQL
    SQL for Smarties
    SQL Programming Style
    SQL Puzzles and Answers
    Thinking in Sets
    Trees and Hierarchies in SQL

  • Any difference in exeution time would be the result of T-SQL syntax pre-processing; one is probably just a macro of the other. All other things being equal, the actual compiled execution plans of two queries using either method will be identical and cached for re-use.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • CELKO (7/18/2016)


    Numbers are numbers and your arguing the interface (infix versus RPN) somehow makes them different.

    Quite the contrary. I'm arguing that you should use whatever features are available in both types of calculators and let the operator of each abstract the answer so that others can understand it.

    I also think that cursor argument your client gave is incredibly stupid.

    Heh... finally... after more than a decade, you and I finally agree on something. 😀

    You also haven't answered my question. What does ANSI say about stored procedures that take parameters? Whatever it says doesn't actually matter because most RDBMS systems have different notations for variables and that alone (and there's so much more) makes portability a myth.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You can get a full or near full implementation of the PSM in DB2, PostgreSQL, MySQL, Mimer and a few other products.

    The SQL/PSM was primarily written by a guy would worked deck and then moved on and Oracle, Andrew Eisenberg. He had been overly impressed by ADA. Since I actually had to work with ADA my career, I really hated it.

    I need to back up a little bit and give you some terms. An argument is the formal placeholder in a procedure declaration; a parameter is the actual value that gets plugged in when the procedures invoked. The the arguments have to be valid SQL datatypes (no surprise here), but they are prefixed with the keywords IN, OUT and IN OUT. Remember the old Algol and Fortran procedures and subroutines? Pass by name, passed by value, etc.? The IN the end keyword says that the parameters value is passed to the procedure; the OUT says that the parameter receives a value from the procedure; the IN OUT goes both ways.

    After the argument list, there is a bunch of clauses to tell the compiler how the procedure is going to work. You tell whether it is deterministic or not, which language it is written in (no, it does not have to be in SQL) and some other options. Determinism is really important for the optimizer.

    The body of the procedure is block structured, based on the ADA model. The start of the block has local declarations, the middle is the actual code, and the last part of it is the exception handling. When exceptions raised control jumps down to that section does what is ever there and then comes back to where it left off.

    The control structures use a terminating keyword; IF-THEN-ELSE-END IF. There is a while loop, for loop, CASE statement (no exression!) and some other stuff, which frankly I cannot remember. But it has no real surprises.

    Oracle has their PL/SQL which is virtually identical to the PSM. DB2 can compile either of the languages, and can even take SQL Server T-SQL that uses extended equality. I do not know how good the translations are, at least most of the work is done by machine and not by hand coding.

    Books in Celko Series for Morgan-Kaufmann Publishing
    Analytics and OLAP in SQL
    Data and Databases: Concepts in Practice
    Data, Measurements and Standards in SQL
    SQL for Smarties
    SQL Programming Style
    SQL Puzzles and Answers
    Thinking in Sets
    Trees and Hierarchies in SQL

  • CELKO (7/18/2016)


    You can get a full or near full implementation of the PSM in DB2, PostgreSQL, MySQL, Mimer and a few other products.

    The SQL/PSM was primarily written by a guy would worked deck and then moved on and Oracle, Andrew Eisenberg. He had been overly impressed by ADA. Since I actually had to work with ADA my career, I really hated it.

    I need to back up a little bit and give you some terms. An argument is the formal placeholder in a procedure declaration; a parameter is the actual value that gets plugged in when the procedures invoked. The the arguments have to be valid SQL datatypes (no surprise here), but they are prefixed with the keywords IN, OUT and IN OUT. Remember the old Algol and Fortran procedures and subroutines? Pass by name, passed by value, etc.? The IN the end keyword says that the parameters value is passed to the procedure; the OUT says that the parameter receives a value from the procedure; the IN OUT goes both ways.

    After the argument list, there is a bunch of clauses to tell the compiler how the procedure is going to work. You tell whether it is deterministic or not, which language it is written in (no, it does not have to be in SQL) and some other options. Determinism is really important for the optimizer.

    The body of the procedure is block structured, based on the ADA model. The start of the block has local declarations, the middle is the actual code, and the last part of it is the exception handling. When exceptions raised control jumps down to that section does what is ever there and then comes back to where it left off.

    The control structures use a terminating keyword; IF-THEN-ELSE-END IF. There is a while loop, for loop, CASE statement (no exression!) and some other stuff, which frankly I cannot remember. But it has no real surprises.

    Oracle has their PL/SQL which is virtually identical to the PSM. DB2 can compile either of the languages, and can even take SQL Server T-SQL that uses extended equality. I do not know how good the translations are, at least most of the work is done by machine and not by hand coding.

    It's a funny thing... I call them "arguments" and someone presumes to correct me to say "parameters". Then, when I call them "parameters" like most do, someone presumes to correct me to say "arguments". The really funny thing on the part of all those presumptive parties is they all knew exactly what I was talking about. 😉

    I do agree with your definitions because those are the correct ones. Now, just imagine if we started talking about operands. 😛

    And you still haven't answered my question.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Maybe I missed the question, but SQL/PSM is how the ANSI/ISO standards handle procedures and functions. Also triggers, God help us.

    As an aside, the "parameter versus argument" distinction becomes important when you pass a constant, a function call, or an expression that resolves to a scaler to a procedure. Algol was the first language to allow this sort of thing, and it required creating what they called a "thunk" to do the calculation.

    Books in Celko Series for Morgan-Kaufmann Publishing
    Analytics and OLAP in SQL
    Data and Databases: Concepts in Practice
    Data, Measurements and Standards in SQL
    SQL for Smarties
    SQL Programming Style
    SQL Puzzles and Answers
    Thinking in Sets
    Trees and Hierarchies in SQL

  • CELKO (7/19/2016)


    Maybe I missed the question, but SQL/PSM is how the ANSI/ISO standards handle procedures and functions. Also triggers, God help us.

    As an aside, the "parameter versus argument" distinction becomes important when you pass a constant, a function call, or an expression that resolves to a scaler to a procedure. Algol was the first language to allow this sort of thing, and it required creating what they called a "thunk" to do the calculation.

    The question was, how do you define the arguments of a stored procedure in ANSI/ISO? If by using variables, then that pretty much shoots portability in the face.

    On the "parameter versus argument" distinction, you're preaching to the choir because I absolutely agree. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • how do you define the arguments of a stored procedure in ANSI/ISO? If by using variables, then that pretty much shoots portability in the face.

    It is actually much more complicated than simple variables. Remember SQL actually stands for "Scarcely Qualifies as a Language" because it was meant to be embedded in a host program 🙂

    The procedure body starts with several optional clauses, but the ones were interested in are:

    LANGUAGE [SQL | <ANSI standard language> | <vendor supported language>]

    Which tells us what language the body of the procedure is written in. This signals the compiler to convert the incoming SQL parameters to the procedure body language, the outgoing host language variables to SQL, and the in out parameters both ways. The standards have defined the castings since the 1992 standards (for example, strings in C have to be nul terminated, but carry a length as an integer at the front in other languages).

    [NOT] DETERMINISTIC

    Tells the compiler that the procedure body is or is not deterministic. This lets the optimizer do quite a lot of nice things. My favorite is to create a working lookup table as procedure is executed. This lookup table can then be used to replace calls to the procedure that use a previously defined parameter value. It also tells the optimizer that it is free to rearrange the code in some ways.

    [CONTAINS SQL | READS SQL DATA | MODIFIES SQL DATA]

    This clause sort of explains itself as it passes along a lot of information to the compiler and the optimizer.

    [IN | OUT | INOUT]

    This tells the compiler how the argument will be used, so we can do conversions if needed. Again, various optimizations can be done, if you know this.

    RETURNS TABLE (<column declaration list >)

    The CREATE FUNCTION statement follows the same syntax, but includes a RETURNS <datatype > clause

    Books in Celko Series for Morgan-Kaufmann Publishing
    Analytics and OLAP in SQL
    Data and Databases: Concepts in Practice
    Data, Measurements and Standards in SQL
    SQL for Smarties
    SQL Programming Style
    SQL Puzzles and Answers
    Thinking in Sets
    Trees and Hierarchies in SQL

  • Eric, thank you for answering my question.

  • Eric M Russell (7/18/2016)


    Any difference in exeution time would be the result of T-SQL syntax pre-processing; one is probably just a macro of the other. All other things being equal, the actual compiled execution plans of two queries using either method will be identical and cached for re-use.

    Eric thank you for answering my question.

  • CELKO (7/20/2016)


    how do you define the arguments of a stored procedure in ANSI/ISO? If by using variables, then that pretty much shoots portability in the face.

    It is actually much more complicated than simple variables. Remember SQL actually stands for "Scarcely Qualifies as a Language" because it was meant to be embedded in a host program 🙂

    The procedure body starts with several optional clauses, but the ones were interested in are:

    LANGUAGE [SQL | <ANSI standard language> | <vendor supported language>]

    Which tells us what language the body of the procedure is written in. This signals the compiler to convert the incoming SQL parameters to the procedure body language, the outgoing host language variables to SQL, and the in out parameters both ways. The standards have defined the castings since the 1992 standards (for example, strings in C have to be nul terminated, but carry a length as an integer at the front in other languages).

    [NOT] DETERMINISTIC

    Tells the compiler that the procedure body is or is not deterministic. This lets the optimizer do quite a lot of nice things. My favorite is to create a working lookup table as procedure is executed. This lookup table can then be used to replace calls to the procedure that use a previously defined parameter value. It also tells the optimizer that it is free to rearrange the code in some ways.

    [CONTAINS SQL | READS SQL DATA | MODIFIES SQL DATA]

    This clause sort of explains itself as it passes along a lot of information to the compiler and the optimizer.

    [IN | OUT | INOUT]

    This tells the compiler how the argument will be used, so we can do conversions if needed. Again, various optimizations can be done, if you know this.

    RETURNS TABLE (<column declaration list >)

    The CREATE FUNCTION statement follows the same syntax, but includes a RETURNS <datatype > clause

    I understand and appreciate all of that but it still evades the basic question. How are the variables that are used to define the arguments for a stored procedure actually specified according to ANSI/ISO standards? Whether they specify that they must begin with an "@" sign or not, not all systems follow either and that makes the possibility of writing truly portable code a myth all by itself. There are plenty more nuances that destroy true portability.

    Code for the best usage of whatever RDBMS engine you're currently using. If you need to support more than one, make sure you properly abstract your code so you can use the best of whatever engine you're using instead of being crippled by a standard that very few RDBMSs are fully compatible with.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • >> I understand and appreciate all of that but it still evades the basic question. How are the variables that are used to define the arguments for a stored procedure actually specified according to ANSI/ISO standards? Whether they specify that they must begin with an "@" sign or not, not all systems follow either and that makes the possibility of writing truly portable code a myth all by itself. There are plenty more nuances that destroy true portability. <<

    Short answer: variables are not used for the arguments. Instead we have "thunks"; they come in and out of the procedure body is SQL datatypes; they get cast and converted to the data type of the procedure body language. This makes SQL/PSM highly portable. Since SQL started off as an embedded language, the standards want to be able to get to whatever host language is embedded in

    Then how is DB2 version 11 able to translate Oracle, DB2, and even the old SQL server with extended equality operators? Little trivia thing here; the @is to make the grammar a simple one pass compiler. It is a version of what we had to do in Fortran I with the letters I through N to signal that a variable was an integer. This prefix trick also shows up on very early operating systems.

    >> Code for the best usage of whatever RDBMS engine you're currently using. If you need to support more than one, make sure you properly abstract your code so you can use the best of whatever engine you're using instead of being crippled by a standard that very few RDBMSs are fully compatible with. <<

    You do not have to write with the full ANSI/ISO current SQL standard. You simply have to find a common subset that is implemented well enough that you can port it. This is what the TPC does with their test suites. This is what they have actually done for quite a few years now. And their test suites are not simple! Frankly, in the real world, the SQL–92 standard still does most of the work in a well-written system. It has enough power to do simple well-defined modules that make up the bulk of commercial data processing. Think about it; how much of your commercial work in COBOL, VB, etc. really uses fancy features?

    Books in Celko Series for Morgan-Kaufmann Publishing
    Analytics and OLAP in SQL
    Data and Databases: Concepts in Practice
    Data, Measurements and Standards in SQL
    SQL for Smarties
    SQL Programming Style
    SQL Puzzles and Answers
    Thinking in Sets
    Trees and Hierarchies in SQL

Viewing 15 posts - 16 through 30 (of 34 total)

You must be logged in to reply to this topic. Login to reply