|
|
|
SSC Journeyman
      
Group: General Forum Members
Last Login: Friday, May 17, 2013 3:15 PM
Points: 85,
Visits: 607
|
|
Very interesting article.
I want to add my voice to those urging caution with regard to relying on order being forced by a clustered index. According to the SQL standard, the only way to enforce order is with ORDER BY.
The problem with relying on a clustered index is that it isn't supported: no matter how many times you have tested it, it is not guaranteed to work, and data corruption introduced by a few pages not being in the order in which you expect them is likely to be both subtle and really difficult to fix. I can only imagine trying to diagnose the problem... it makes me queasy just thinking about it.
I will consider using the ORDER BY method, which seems to me to be innovative and safe code.
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Yesterday @ 2:17 AM
Points: 6,862,
Visits: 8,049
|
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 2:52 PM
Points: 1,322,
Visits: 1,071
|
|
Did you test using the Clustered Index with the ORDER BY subquery? It seems to me that the ORDER BY Would enforce the order, but since you are using a Clustered Index it wouldn't cause much overhead unless you were going to get bit by a disordered resultset anyway. -- JimFive
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Tuesday, March 05, 2013 10:18 AM
Points: 47,
Visits: 84
|
|
Since you installed SQL 2K5, have you looked into the "OVER" clause (BOL reference: ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/ddcef3a6-0341-43e0-ae73-630484b7b398.htm) and the Ranking Functions (BOL reference: ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/e7f917ba-bf4a-4fe0-b342-a91bcf88a71b.htm)?
I am curious how they perform against your test. I have used the Ranking Functions, but only on result sets of a few thousand rows and it has performed well for me.
Darrin
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Yesterday @ 7:57 PM
Points: 6,998,
Visits: 13,949
|
|
good one - no doubt about it. I hadn't yet seen the merry-go-round in action, but it does make sense now.
From what I've seem, the OVER syntax, and the ranking functions give you no advantage whatsoever on anything except for counts in this particular case. It's in my mind one of the failings/shortcoming of that syntax, since with all it does, you STILL have to do triangle joins or this kind of technique to get any kind of running aggregate other than a count (which you get implicitly through the ROW_NUMBER() function). I am pretty sure that's part of the OVER syntax from Ansi that didn't make the 2005 implementation (as I recall, that OTHER database product does running aggregates using its OVER syntax)..
And - just like you'd expect - using a CTE to try to force the order does nothing for you. It operates the same as a view does (inner ORDER BY is ignored).
---------------------------------------------------------------------------------- Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: 2 days ago @ 7:15 AM
Points: 1,352,
Visits: 1,738
|
|
Jeff, Thanks for the great examples on generating running totals. Question: You mentioned that you have to include the "anchor" (or dummy) in the update statement in order for it to work correctly, but you didn't explain why. I can see its use in the group total, group rank, but could you explain why it is needed in the update statement for the overall running total? (aside from "just because it doesn't work without it")
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, April 29, 2013 10:28 AM
Points: 46,
Visits: 262
|
|
Jeff, thanks for this article I learn something new. I was relying on Reporting Service reports, cubes and MDX to do the running total for me. This article shows me a way to do it in classic reports. I wonder if the same type of technique could be used as a general way to get a rid of all cursors and while loops. I stay away from but sometimes have no alternative. I have a specific problem now. I must create a table to report on contract compliance. I can have multiple contracts for the same customer with or without the same products in them. I can have different dates of start and finish. Dates can overlap. The revenue table does not have any contract info. To merge contract info with revenue data, I must use while loops and for each row, I put the contract num, the customer, the product, the start and finish date into variables, then in a while loop I add row by row the combined data in the result table by merging the Contract num with the matching data found in the revenue table. When done one can look at performance on a contract by contract basis. The sum of all contracts for a given customer product and period could double count revenue. This is understood and acceptable. I have 11 different type of contracts, the level of grouping changes and the nature of the revenue changes as well, can be direct, indirect and has to look at both revenue and orders. Takes 5 hours to run on a 64 bits machine, once a week on Saturday. These tables have million of rows, it is working but if it could work in a fraction of the time it would be better.
BI Guy
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 12:47 PM
Points: 37,730,
Visits: 29,992
|
|
Very nice, and thanks for the link.
Gail Shaw Microsoft Certified Master: SQL Server 2008, MVP SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
We walk in the dark places no others will enter We stand on the bridge and no one may pass
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Yesterday @ 7:57 PM
Points: 6,998,
Visits: 13,949
|
|
Philippe -
While I would recommend you start a separate thread on the specifics of your process, I would imagine that this technique might be useable in your scenario. Hard to tell for sure, but it's certainly one way to step away from some loops and cursors (although to be fair - the MS gear-heads that blog on this call this technique the "inner loop" process, since you're flying through each row in sequence, albeit VERY quickly).
Jeff is probably going to give me the evil eye on that one....:)
---------------------------------------------------------------------------------- Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Sunday, September 16, 2012 3:26 AM
Points: 1,038,
Visits: 443
|
|
Unfortunately SQL 2k5's over clause is missing that little bit of magic present in the SQL standard and other SQL products. The full functionality is also not present in 2008 but perhaps the next version (2010/2011?) might have it. See http://www.sqlmag.com/Articles/ArticleID/95007/95007.html?Ad=1. In the article you can vote to have the extra features added and there's a link to a well written MS Word document in the article. In fact it serves as a great introduction by itself to the OVER clause and what's possible both now and, hopefully, in the future. Direct link: http://www.insidetsql.com/OVER_Clause_and_Ordered_Calculations.doc :D Courtesy goes to Itzik Ben-Gan and Sujata Mehta for the article/paper - I'm just linking to it.
|
|
|
|