Fun with OVER

  • Comments posted to this topic are about the item Fun with OVER

    There are no facts, only interpretations.
    Friedrich Nietzsche

  • Did u really get the desired output from the SELECT statement you posted. I am getting the error when i executed those queries.

    When I replaced COUNT(*) with ROW_NUMBER() then A & B return the desired output.

    Sorry, when i run the statements in SQL Server 2012 I am able to get the desired output using B & C. Earlier I tried in SQL Server 2008 R2.

    Please ignore this!!

  • Very interesting question, thanks.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Interesting, but when I change the order of the rows in the insert (Mary before Anita) the order of the rows in query C changes.

    The result of Query C depends on the initial order of the rows returned, which is not specified with an ORDER BY and can change.

  • Louis Hillebrand (6/26/2014)


    Interesting, but when I change the order of the rows in the insert (Mary before Anita) the order of the rows in query C changes.

    The result of Query C depends on the initial order of the rows returned, which is not specified with an ORDER BY and can change.

    +1

    I guess however the main aim of the exercise was checking on ROWS v RANGE and so trying to ensure we get 1, 3, 3 or 1, 2, 3 in the results - other than by time, I assume the order wasn't critical.

  • Nice question. Thanks for sharing

  • very interesting question.Thanks Barry.

    --rhythmk
    ------------------------------------------------------------------
    To post your question use below link

    https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help
    🙂

  • I also put "Just B" on the basis that ordering by name in query C isn't guaranteed.

  • This was removed by the editor as SPAM

  • Though QOTD mentions to use sql2012, I had to use only 2008 R2 and i tried with row_number.

    option A and B returned the same result in 2008 R2

    but for option C I read the same reference and concluded that C also returns in the same order as it is "rows unbounded preceeding" not the range

    Got into conclusion that all 3 queries will return the same.

    But I missed the default consideration "The default framing is RANGE rather than ROWS" 🙁

    My question is, OVER (default framing) will differ with count(*) or row_number() ?

    row_number -- any how indicates "row"

    but count(*) ?

  • Thank you for the post, good one. got it right, I was sure of 2nd but on third I was not sure and later kind of felt it might work, so selected b and c. 🙂 Learnt the new option C way as I have never used it.

    ww; Raghu
    --
    The first and the hardest SQL statement I have wrote- "select * from customers" - and I was happy and felt smart.

  • It took me a while to get my head around the difference between the windowing (PARTITION BY) and framing within the window (ORDER BY). The critical aspect of this question was understanding how ROWS and RANGE framing give different results when there are duplicate values produced by the ORDER BY. In the question the window is all three rows (PARTYNAME is the same for all three) so the function (COUNT) is applied over that set and the sequence of records is determined by the framing. In A the framing was left at default which is RANGE. RANGE framing goes by the value used and since both Anita and Mary have the same time they get the same count. B also uses FRAME but by changing the value of the frame (ARRIVALTIME and DINERFIRSTNAME together) every row has a unique value and is counted separately. In C we used ROWS framing and it calculates the function by POSITION rather than value. Since Anita and Mary have the same ARRIVALTIME, the sort order is not guaranteed. The most likely would be entry order but don't count on it.

    There are no facts, only interpretations.
    Friedrich Nietzsche

  • pmadhavapeddi22 (6/26/2014)


    Though QOTD mentions to use sql2012, I had to use only 2008 R2 and i tried with row_number.

    option A and B returned the same result in 2008 R2

    but for option C I read the same reference and concluded that C also returns in the same order as it is "rows unbounded preceeding" not the range

    Got into conclusion that all 3 queries will return the same.

    But I missed the default consideration "The default framing is RANGE rather than ROWS" 🙁

    My question is, OVER (default framing) will differ with count(*) or row_number() ?

    row_number -- any how indicates "row"

    but count(*) ?

    The reason you get different results with ROW_NUMBER is that you now have unique values in your frame. The trick in this question was that RANGE and ROWS give different results when values are equal. The COUNT was essentially saying in A "count where ARRIVALTIME is <= the current value". Anita and Mary both have the same time so their counts would be equal. B forced every value to be unique by using both time and first name together. C used ROWS ranging which is based on position in the window rather than value. As pointed out by another poster, you can't guarantee the order of the names because the ORDER BY only looked at ARRIVALTIME.

    There are no facts, only interpretations.
    Friedrich Nietzsche

  • Thank you for the question; it's a good one, but slightly flawed.

    As others have pointed out, the results for query C are not guaranteed, but it is not just the order of the rows, but the value in the NumberArrived column that may be different. In the example, query C produces the correct results, but only because Anita was inserted before Mary. If you change the order of the data rows in the INSERT statement to insert Jean, then Mary, then Anita, then Query C will produce Jean - 1, Mary - 2, and Anita - 3, which does not match the desired output. Thus, query C produces the correct results sometimes, but not all of the time, while query B is accurate all of the time. Therefore, the correct answer should have been only query B.

  • Also, can any explain... in query A, when Jean has 1 and Anita and Mary has 3 where did the 2 go?

    ww; Raghu
    --
    The first and the hardest SQL statement I have wrote- "select * from customers" - and I was happy and felt smart.

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply