Group by

  • Comments posted to this topic are about the item Group by

    :-PManie Verster
    Developer
    Johannesburg
    South Africa

    I can do all things through Christ who strengthens me. - Holy Bible
    I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)

  • Thanks for the question!

    I knew the correct answer, but picked the wrong one.. Argh. (I guess I need to pay more attention when I am multi-tasking.)

  • Got it correct...:-)

  • Good question!

    In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:

    The correct result of any SQL query is defined in terms of the logical(*) processing order.

    1. FROM clause. Build an intermediate set as a copy of the single table in the FROM clause, or as the result of joining the multiple tables. Or supply a single-row result set with no columns if no FROM is specified.

    2. WHERE clause. For each row in the output of step 1, evaluate the WHERE condition. If the result is True, add the row to the intermediate result set that is the output of this step; if it evaluates to False or Unknown (remember three-valued logic!!), discard the row. If no WHERE clause is specified, all rows are assumed to pass the test.

    3. GROUP BY clause. Create a new intermediate result set by taking the output of step 2 and forming groups based on the columns or expressions in the GROUP BY expression. If no GROUP BY is present, but the query does include a HAVING clause and/or does use aggregate functions in the SELECT clause, all rows are combined into one single group (i.e., "GROUP BY ()" is assumed). Otherwise, no grouping is applied (all rows are retained as individual rows).

    4. HAVING clause. Similar to the WHERE clause, the expression in the HAVING clause is evaluated - but not for individual rows, but for each group in the output of step 3. If the condition evaluates to True, the group is inserted into the intermediate result set of this step; if it's False or Unknown, the entire group gets discarded.

    5. SELECT clause. Based on the remaining rows and groups in the output of step 4, the result set of the query is constructed. Each expression in the SELECT list becomes a column in the result set. Columns in the intermediate result set that are not included in the SELECT list are dropped (unless SELECT * is used, becuase that is a shorthand for "all columns in the intermediate result set").

    6. ORDER BY clause. Officially a non-relational operation, since relational sets are unordered by definition. But as a service for the front-end, the database can convert the result set ("set" - meaning unordered) to a result cursor ("cursor" - ordered collection). Officially, only columns or expressions that are in the SELECT clause can be used in the ORDER BY clause (as the other columns were dropped from the result, and hence not included in the input of this step). SQL Server does allow this. Logically, you can consider SQL Server adding some extra colums to the SELECT list, with an "invisible" attribute.

    Note the absence of subqueries in this list. Logically(*), whenever a subqueriy is encountered in the steps above, evaluation of the outer query is halted, the evaluation process for the subquery is evaluated (steps 1 through 5, as ORDER BY is not allowed in a subquery [for simplicity sake, I choose to not include the SQL Server dialect TOP ... ORDER BY in this description]), and when the result of the subquery is known, the processing of the outer query is resumed.

    (*) Note the term "logical". There is no requirement at all that queries are actually executed in this order; as long as the results match those that would be returned by following this order. In fact, one of the key components of SQL Server (and any RDBMS) is the optimizer, a component that considers many strategies to find the desired results and picks the one likely to be the "cheapest" in terms of execution time and resource usage.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Thanks for the question!

  • Hugo Kornelis (9/6/2010)


    Good question!

    In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:

    Because Microsoft havent implemented it.

    We are allowed to use aliases in Group by (and other stuff) in the form of table aliases. Therefore there is no real reason for not allowing column aliases as well (well maybe a bit of performance).

  • This was removed by the editor as SPAM

  • tommyh (9/6/2010)


    Hugo Kornelis (9/6/2010)


    Good question!

    In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:

    Because Microsoft havent implemented it.

    We are allowed to use aliases in Group by (and other stuff) in the form of table aliases. Therefore there is no real reason for not allowing column aliases as well (well maybe a bit of performance).

    And who is we?

    On topic: nice question about the basics. And thanks for the explanation in depth Hugo.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • da-zero (9/6/2010)


    tommyh (9/6/2010)


    Hugo Kornelis (9/6/2010)


    Good question!

    In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:

    Because Microsoft havent implemented it.

    We are allowed to use aliases in Group by (and other stuff) in the form of table aliases. Therefore there is no real reason for not allowing column aliases as well (well maybe a bit of performance).

    And who is we?

    Hrmm having a hard time deciding if that is supposed to be sarcastic or not. But anyway.

    I used "we" to represent everyone using SQL. Honestly dont know how else i could have written that.

  • tommyh (9/6/2010)


    Hugo Kornelis (9/6/2010)


    Good question!

    In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:

    Because Microsoft havent implemented it.

    No, because Microsoft has chosen not to go against the standard.

    From ISO/IEC 9075-2:2003 (E), 7.9 <group by clause>:

    "Syntax Rules

    1) Each <grouping column reference> shall unambiguously reference a column of the table resulting from

    the <from clause>. A column referenced in a <group by clause> is a grouping column."

    SQL Server does allow you to use expressions in the GROUP BY instead of only simple column references (which extends the standard but oes not violate it). Allowing column aliases would be a violation of the standard, and would break compatibility with ANSI-standard conformant products and code.

    We are allowed to use aliases in Group by (and other stuff) in the form of table aliases. Therefore there is no real reason for not allowing column aliases as well (well maybe a bit of performance).

    Of course we can use a table alias in the GROUP BY. That follows logically from the same logical query evaluation order described above. The FROM clause is evaluated first. Table aliases are assigned in the FROM clause. As a result, you can use them in alll other clauses (and, in fact, MUST use them - once you assign a table alias, you can no longer reference the table by its original table name).


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (9/6/2010)


    tommyh (9/6/2010)


    Hugo Kornelis (9/6/2010)


    Good question!

    In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:

    Because Microsoft havent implemented it.

    No, because Microsoft has chosen not to go against the standard.

    I think we see "implemented" a bit differently. I see it as something not being in the software (regardless of reason). And as for their reason. Well maybe it was because they didnt want to go against the standard, maybe not. Microsoft aint that know to obay standards.

  • I got the answer right, after a long series of wrongs.

    Hugo Kornelis (9/6/2010)


    Good question!

    1. FROM clause. Build an intermediate set as a copy of the single table in the FROM clause, or as the result of joining the multiple tables. Or supply a single-row result set with no columns if no FROM is specified.

    Whoa!! Hugo, I always thought this behavior was from Visual Foxpro, to copy the entire table first before selecting the data.

    I always thought in SQL, the data was Selected first before Where and Group, etc. Is

  • Open Minded (9/6/2010)


    I got the answer right, after a long series of wrongs.

    Hugo Kornelis (9/6/2010)


    Good question!

    1. FROM clause. Build an intermediate set as a copy of the single table in the FROM clause, or as the result of joining the multiple tables. Or supply a single-row result set with no columns if no FROM is specified.

    Whoa!! Hugo, I always thought this behavior was from Visual Foxpro, to copy the entire table first before selecting the data.

    I always thought in SQL, the data was Selected first before Where and Group, etc. Is

    Just remember, the description I gave is for the logical processing order. Actual processing order is always different - the optimizer picks the "best" plan to get the results. Making a competely redundant copy of a table will never be included in that plan!


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • tommyh (9/6/2010)


    da-zero (9/6/2010)


    And who is we?

    Hrmm having a hard time deciding if that is supposed to be sarcastic or not. But anyway.

    I used "we" to represent everyone using SQL. Honestly dont know how else i could have written that.

    Whoops. My bad.

    I didn't see you were talking about table aliases instead of column aliases. So I thought you were referring to some database (other than SQL Server) where column aliases in the group by are possible.

    Guess I'm not fully awake yet...

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I parsed the query before running it.. and it gave me "Command(s) completed successfully.". Not sure why.. ?

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply