ORDER BY in the OVER clause

  • I am using the AVG( ) with the OVER clause and within the OVER clause I have a question about ORDER BY.  In the attached Excel file, I have two examples, one on each tab.  Both tabs show table in the FROM clause in teal green (columns A:F).  Also on both tabs, the result of my query is in yellow.

    For example 1A, I am doing ORDER BY Sales, then taking the AVG(Sales).  In this case, each unique grouping of values under Sales is treated like a row.  For example, F7 would be on one row, then F8:F10 is recognized as another row since those 3 cells contain the same value.  Furthermore, in the results table (yellow table) under M8:M10, the value 211,976.00 appears, which makes sense if F8:F10 is being recognized as one row.  So far no problem.

    For example 1D, I am also doing ORDER BY Sales.  In this query, my window for the AVG( ) is defined as starting at 2 rows before the current row and ending at 1 row following the current row.  In this example, each grouping of unique values under Sales is not being recognized as a "row" like example 1A.  Instead, each row is recognized as a row.

    My question is this.  Why is it, that each grouping of unique values under Sales is recognized as a row in example 1A, but not in example 1D.

    Attachments:
    You must be logged in to view attached files.
  • As much as I love Excel, I'm not going to go through it to generate DDL and sample data.  Please provide both and we can help you.

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Certain windowed functions that use an ORDER BY clause require a frame.  If you do not provide a frame it uses the default RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.  When you use range, all tied rows are included in the range; when you use rows, ties are only included if they fall in the window based on the sort.  Using range is deterministic, but using rows is not necessarily deterministic, which is why range is the default.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • I have a screenshot of the SQL code pasted into the Excel file.  Did you see the screenshot?

    For the sample data, that is shown in the first table on both tabs with a teal green heading.

  • I forgot to take that into account.  That makes sense now.

  • I have another problem that might be similar.  In the attached file to this post, the tab called Example 1G shows the query I used above the yellow columns.  The data in the FROM table appears in the teal green columns.  I can make sense out of the calculations for cells M7:M12, however I run into a problem for cell M13.

    In cells F11:F14, I have my "window" outlined, but the average calculated doesn't equal M13, so I must have the wrong window.

    When row 13 is processed, what is the correct window?  That is probably the best way to frame my question and will probably enable me to understand what I am doing wrong.

    Attachments:
    You must be logged in to view attached files.
  • Michael.leach2015 - you should read Mike01's link.  A lot of people (myself included) do not like downloading random files off of a forum to try to help people.

    Drew.Allen did a good job explaining how the windowing function works.

    but I bit the bullet and download it.  When I look at it, it looks like is working IF I understand things correctly.  Column M is the results of your SQL query, correct?  If so, then everything looks right to me.  If you take F11 through F14 and sum them up you get M13 BUT only if you use the values in the YELLOW section.  What you will notice is that the Green section, row 14 does not match the yellow section row 14.  SQL changed the order on you as you didn't tell it how it should be ordered EXCEPT by region.  So some of the ordering got bumped around.

    If you take the average of K11:K14 you get M13.  Everything in A through F is useless as SQL changed the order since you didn't tell it how it should order it.

    That being said, I see no logic to the ordering in your excel file.  For example, what order SHOULD these values be in:

    LETTER, NUMBER

    A,                5

    A,                3

    A,                7

    A,                1

    When you say "order by LETTER"?

    Again though, post DDL as Mike01 suggested or people likely won't help you.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • In looking back I guess there wasn't really a need to post the original data in the blue/green columns.  I understand it now.

    Also, I'll keep in mind the suggestion about posting a message.

    Thank you.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply