WHERE vs. GROUP BY again

  • The following two T-SQL statements return exactly the same row set whatever data is:

    [font="Courier New"]SELECT something FROM somewhere WHERE condition GROUP BY grouping[/font]

    and

    [font="Courier New"]SELECT something FROM somewhere GROUP BY grouping HAVING condition[/font]

    provided that

    *) all italic placeholders are replaced with corresponding literally identical strings;

    *) both statements are syntactically correct (condition contains no aggregate functions).

    I think the above assertion is correct. Am I right please?

    Thanks.

  • (condition contains no aggregate functions)

    This is very important. Since you specify you can't filter on aggregate values, the HAVING clause acts the same as the WHERE clause. In other words , you have written two identical SELECT statements.

    However, if you can filter on aggregate values in the HAVING clause, there are surely not the same.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Thank you, Koen. I parenthesized the quoted phrase because I deem it's an implication of "both statements are syntactically correct". Isn't it?

  • Igor Makedon (1/28/2013)


    Thank you, Koen. I parenthesized the quoted phrase because I deem it's an implication of "both statements are syntactically correct". Isn't it?

    I wouldn't use "correct". If you change the HAVING clause to filter on aggregated values, the statement is still syntactically correct, however it is not the same as the other one.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • The condition in a HAVING clause is different than the condition on a WHERE clause as the HAVING condition acts on the result of the aggregation.

    Consider the condition Count(something) > 2

    This as a HAVING condition would return a data set when there were more than 2 somethings existed. It is not possible to replicate this condition with the WHERE clause as each individual data row is 'unaware' of other rows with the same value.

    to put the answer the other way round.

    You can use the WHERE condition in a HAVING condition - but is a misuse of HAVING

    You can not always put the HAVING condition in the WHERE condition - If you can, my suggestion is that you do.

  • I wouldn't use "correct".

    Me neither, but in the next sentence you are using exactly wording of mine.

    And besides, could you please offer your Yes/No answer to my question I put in my original message?

  • Consider the condition Count(something) > 2

    Richard, please heed my note about syntactically correct and aggregate functions.

  • Excluding the aggregates then yes your assertion is correct and would produce the same results.

    But if I were an examiner in a test then I would say that by using the condition in the HAVING clause, that you do not understand the nature of the HAVING clause which is to act on aggregates.

    The processes acting within the server would produce the same data set, but would process the original data in different ways. Not sure why you asked the question.

  • I need to optimize some non-SQL code that composes SQL statements dynamically. Some pieces of code look very much alike, but put the a priori non-aggregate condition either in HAVING or in WHERE clauses seemingly at random (different programmers at different times). To incorporate code into a single routine and thus to throw away costs of past development inconsistencies, is my goal.

  • WHERE and HAVING are fundamentally different; they should never be thought of as equivalent or interchangeable. That is, for a given query/subquery, only one of them is right to use.

    Use WHERE when you want to filter individual row values.

    Use HAVING when you want to filter based on aggregate/group values.

    SQL Server used to allow non-aggregate column names in the HAVING clause, and indeed then did not evaluate that condition until after GROUPing had been done; from SQL 2005 on, I don't SQL even allows that any more.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Sorry...

  • Fine!

    Are the folowing two SELECT statements equvalent (i.e. -- see my original question, -- "return exactly the same row set" no matter "what data is")?

    [font="Courier New"]SELECT xx FROM yy WHERE xx >=5 GROUP BY xx

    SELECT xx FROM yy GROUP BY xx HAVING xx >=5[/font]

  • Igor Makedon (1/28/2013)


    Fine!

    Are the folowing two SELECT statements equvalent (i.e. -- see my original question, -- "return exactly the same row set" no matter "what data is")?

    [font="Courier New"]SELECT xx FROM yy WHERE xx >=5 GROUP BY xx

    SELECT xx FROM yy GROUP BY xx HAVING xx >=5[/font]

    NO, they are not equivalent, atlhough they may return the same result set.

    Since xx is a non-aggregate column, "WHERE xx ..." is correct.

    In some SQLs, SQL will not check xx with HAVING until after the GROUPing is done. SQL Server now effectively rewrites the second query to match the first. But it's wrong to rely on that. I would suspect the second query is not even guaranteed to be accepted by SQL itself as a valid query -- might depend on the SQL engine and/or version.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • My question is: just "may" [sometimes]? or "always [any data] do"?

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply