Group by

Question

Group by

Manie Verster

SSCertifiable

Points: 7152
More actions
September 5, 2010 at 1:49 am

#239941

Comments posted to this topic are about the item Group by
:-PManie Verster
Developer
Johannesburg
South Africa
I can do all things through Christ who strengthens me. - Holy Bible
I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply

UMG Developer SSChampion Points: 13482 More actions · Answer 1

Thanks for the question!

I knew the correct answer, but picked the wrong one.. Argh. (I guess I need to pay more attention when I am multi-tasking.)

manoj.kumar-1078069 Valued Member Points: 59 More actions · Answer 2

manoj.kumar-1078069

Valued Member

Points: 59

September 5, 2010 at 9:20 pm

#1216754

Got it correct...:-)

Hugo Kornelis SSC Guru Points: 64789 More actions · Answer 3

Good question!

In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:

The correct result of any SQL query is defined in terms of the logical(*) processing order.

1. FROM clause. Build an intermediate set as a copy of the single table in the FROM clause, or as the result of joining the multiple tables. Or supply a single-row result set with no columns if no FROM is specified.

2. WHERE clause. For each row in the output of step 1, evaluate the WHERE condition. If the result is True, add the row to the intermediate result set that is the output of this step; if it evaluates to False or Unknown (remember three-valued logic!!), discard the row. If no WHERE clause is specified, all rows are assumed to pass the test.

3. GROUP BY clause. Create a new intermediate result set by taking the output of step 2 and forming groups based on the columns or expressions in the GROUP BY expression. If no GROUP BY is present, but the query does include a HAVING clause and/or does use aggregate functions in the SELECT clause, all rows are combined into one single group (i.e., "GROUP BY ()" is assumed). Otherwise, no grouping is applied (all rows are retained as individual rows).

4. HAVING clause. Similar to the WHERE clause, the expression in the HAVING clause is evaluated - but not for individual rows, but for each group in the output of step 3. If the condition evaluates to True, the group is inserted into the intermediate result set of this step; if it's False or Unknown, the entire group gets discarded.

5. SELECT clause. Based on the remaining rows and groups in the output of step 4, the result set of the query is constructed. Each expression in the SELECT list becomes a column in the result set. Columns in the intermediate result set that are not included in the SELECT list are dropped (unless SELECT * is used, becuase that is a shorthand for "all columns in the intermediate result set").

6. ORDER BY clause. Officially a non-relational operation, since relational sets are unordered by definition. But as a service for the front-end, the database can convert the result set ("set" - meaning unordered) to a result cursor ("cursor" - ordered collection). Officially, only columns or expressions that are in the SELECT clause can be used in the ORDER BY clause (as the other columns were dropped from the result, and hence not included in the input of this step). SQL Server does allow this. Logically, you can consider SQL Server adding some extra colums to the SELECT list, with an "invisible" attribute.

Note the absence of subqueries in this list. Logically(*), whenever a subqueriy is encountered in the steps above, evaluation of the outer query is halted, the evaluation process for the subquery is evaluated (steps 1 through 5, as ORDER BY is not allowed in a subquery [for simplicity sake, I choose to not include the SQL Server dialect TOP ... ORDER BY in this description]), and when the result of the subquery is known, the processing of the outer query is resumed.

(*) Note the term "logical". There is no requirement at all that queries are actually executed in this order; as long as the results match those that would be returned by following this order. In fact, one of the key components of SQL Server (and any RDBMS) is the optimizer, a component that considers many strategies to find the desired results and picks the one likely to be the "cheapest" in terms of execution time and resource usage.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

SamaReddy SSC Veteran Points: 264 More actions · Answer 4

SamaReddy

SSC Veteran

Points: 264

September 6, 2010 at 1:41 am

#1216779

Thanks for the question!

tommyh SSCertifiable Points: 6302 More actions · Answer 5

Hugo Kornelis (9/6/2010)
Good question!
In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:

Because Microsoft havent implemented it.

We are allowed to use aliases in Group by (and other stuff) in the form of table aliases. Therefore there is no real reason for not allowing column aliases as well (well maybe a bit of performance).

Koen Verbeeck SSC Guru Points: 259215 More actions · Answer 6

tommyh (9/6/2010)
Hugo Kornelis (9/6/2010)
Good question!
In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:
Because Microsoft havent implemented it.
We are allowed to use aliases in Group by (and other stuff) in the form of table aliases. Therefore there is no real reason for not allowing column aliases as well (well maybe a bit of performance).

And who is we?

On topic: nice question about the basics. And thanks for the explanation in depth Hugo.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

tommyh SSCertifiable Points: 6302 More actions · Answer 7

da-zero (9/6/2010)
tommyh (9/6/2010)
Hugo Kornelis (9/6/2010)
Good question!
In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:
Because Microsoft havent implemented it.
We are allowed to use aliases in Group by (and other stuff) in the form of table aliases. Therefore there is no real reason for not allowing column aliases as well (well maybe a bit of performance).
And who is we?

Hrmm having a hard time deciding if that is supposed to be sarcastic or not. But anyway.

I used "we" to represent everyone using SQL. Honestly dont know how else i could have written that.

Hugo Kornelis SSC Guru Points: 64789 More actions · Answer 8

tommyh (9/6/2010)
Hugo Kornelis (9/6/2010)
Good question!
In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:
Because Microsoft havent implemented it.

No, because Microsoft has chosen not to go against the standard.

From ISO/IEC 9075-2:2003 (E), 7.9 <group by clause>:

"Syntax Rules

1) Each <grouping column reference> shall unambiguously reference a column of the table resulting from

the <from clause>. A column referenced in a <group by clause> is a grouping column."

SQL Server does allow you to use expressions in the GROUP BY instead of only simple column references (which extends the standard but oes not violate it). Allowing column aliases would be a violation of the standard, and would break compatibility with ANSI-standard conformant products and code.

We are allowed to use aliases in Group by (and other stuff) in the form of table aliases. Therefore there is no real reason for not allowing column aliases as well (well maybe a bit of performance).

Of course we can use a table alias in the GROUP BY. That follows logically from the same logical query evaluation order described above. The FROM clause is evaluated first. Table aliases are assigned in the FROM clause. As a result, you can use them in alll other clauses (and, in fact, MUST use them - once you assign a table alias, you can no longer reference the table by its original table name).

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

tommyh SSCertifiable Points: 6302 More actions · Answer 9

Hugo Kornelis (9/6/2010)
tommyh (9/6/2010)
Hugo Kornelis (9/6/2010)
Good question!
In addition to your explanation, here is WHY an alias can be used in the ORDER BY, but not in the GORUP BY:
Because Microsoft havent implemented it.
No, because Microsoft has chosen not to go against the standard.

I think we see "implemented" a bit differently. I see it as something not being in the software (regardless of reason). And as for their reason. Well maybe it was because they didnt want to go against the standard, maybe not. Microsoft aint that know to obay standards.

Open Minded SSCommitted Points: 1872 More actions · Answer 10

I got the answer right, after a long series of wrongs.

Hugo Kornelis (9/6/2010)
Good question!
1. FROM clause. Build an intermediate set as a copy of the single table in the FROM clause, or as the result of joining the multiple tables. Or supply a single-row result set with no columns if no FROM is specified.

Whoa!! Hugo, I always thought this behavior was from Visual Foxpro, to copy the entire table first before selecting the data.

I always thought in SQL, the data was Selected first before Where and Group, etc. Is

Hugo Kornelis SSC Guru Points: 64789 More actions · Answer 11

Open Minded (9/6/2010)
I got the answer right, after a long series of wrongs.
Hugo Kornelis (9/6/2010)
Good question!
1. FROM clause. Build an intermediate set as a copy of the single table in the FROM clause, or as the result of joining the multiple tables. Or supply a single-row result set with no columns if no FROM is specified.
Whoa!! Hugo, I always thought this behavior was from Visual Foxpro, to copy the entire table first before selecting the data.
I always thought in SQL, the data was Selected first before Where and Group, etc. Is

Just remember, the description I gave is for the logical processing order. Actual processing order is always different - the optimizer picks the "best" plan to get the results. Making a competely redundant copy of a table will never be included in that plan!

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Koen Verbeeck SSC Guru Points: 259215 More actions · Answer 12

tommyh (9/6/2010)
da-zero (9/6/2010)
And who is we?
Hrmm having a hard time deciding if that is supposed to be sarcastic or not. But anyway.
I used "we" to represent everyone using SQL. Honestly dont know how else i could have written that.

Whoops. My bad.

I didn't see you were talking about table aliases instead of column aliases. So I thought you were referring to some database (other than SQL Server) where column aliases in the group by are possible.

Guess I'm not fully awake yet...

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

AmarJhingan Newbie Points: 7 More actions · Answer 13

I parsed the query before running it.. and it gave me "Command(s) completed successfully.". Not sure why.. ?

tommyh SSCertifiable Points: 6302 More actions · Answer 14

amresh_jhingan (9/6/2010)
I parsed the query before running it.. and it gave me "Command(s) completed successfully.". Not sure why.. ?

Syntax check only. And the syntax is okay. But that doesnt mean that it will run.

You can for instance parse say "select something from nothing" without having a table called "nothing".