Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12345»»»

Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5 Expand / Collapse
Author
Message
Posted Wednesday, January 30, 2008 10:21 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:30 PM
Points: 36,706, Visits: 31,156
Comments posted to this topic are about the item

Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5


The new article was published on 10 Nov 2009 and may be found at the following URL...

http://www.sqlservercentral.com/articles/T-SQL/68467/

The new article is basically a rewrite of the old article with some fixes and workarounds for some of the things we've encountered in the last year. A second article is in the works for some of the many uses for the "Quirky Update"... there was just too much in the first article to conveniently fit.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #449802
Posted Wednesday, January 30, 2008 10:25 PM


Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Sunday, February 21, 2010 11:12 PM
Points: 13, Visits: 43
Watch out with those clustered indices...the order is NOT guaranteed to be the index order.
There are a few cases - out of disk space is one I experienced, parallel query execution et al - that cause the leaf data to be out of order.
See also http://blog.sqlauthority.com/2007/09/24/sql-server-order-of-result-set-of-select-statement-on-clustered-indexed-table-when-order-by-is-not-used/



Regards
Andy Davies

Post #449803
Posted Wednesday, January 30, 2008 10:41 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:30 PM
Points: 36,706, Visits: 31,156
Yep... thanks for the feedback. I've seen that, too. But only on SELECTs... and only on non-clustered indexes. I've never seen it fail on UPDATEs on CLUSTERED index hints. That's why I did all the proofing in the article with the Merry-Go-Round index.

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #449807
Posted Thursday, January 31, 2008 1:06 AM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Today @ 2:46 AM
Points: 575, Visits: 2,500
Yes, I'd definitely agree that the 'Quirky update' approach is the right one. I've always used an 'order by' but never tried clustered index/hints to force the order. It seems pretty robust too. I've used it in Sybase and, I reckon, all versions of SQL Server in the past.

The strange thing about the 'Quirky Update' technique is that, one has experimented with it, one discovers all sorts of problems that it solves. Directory trees, Book indices (Index, Use of, Updates), financial reporting. Whoever put the feature into Sybase was very far-sighted.



Best wishes,

Phil Factor
Simple Talk
Post #449834
Posted Thursday, January 31, 2008 1:48 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 1:45 AM
Points: 2,826, Visits: 3,866
Hello Jeff,

very interesting article.
Somehow it throws all my beliefs into the trash bin:
1. The update method with variables is not explicitly supported by Microsoft
2. The clustered index does not guarantee ordered results
3. Order By in derived tables does not guarantee ordered results.
Your "evidence" does not really convince me. It could be just a lucky day (or a lot of them) ;).

If anyone would have some "official" documentation on these 3 topics, that would be great!

Btw: Did anyone try the "Order By" Solution?
It returns wrong results for me, but maybe did not copy the code correctly (somehow the linefeeds
disappear with c/p).


Best Regards,
Chris Büttner
Post #449839
Posted Thursday, January 31, 2008 3:40 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, May 16, 2014 4:12 AM
Points: 376, Visits: 148
I think Andy has a point there. On multiprocessor machine SQL Server might use more threads to scan the index and then merge the result sets in one data stream before executing the update, and this could mess up things. I'd add a maxdop 1 option just to be sure. Even with one thread, I'm not sure that SQL Server would start scanning from the beginning of the index if some data pages are already in memory (perhaps because someone has just finished a full scan of the clustered index). We should use an order by to be sure, but it's not allowed on updates. Here's a trick I use in situations like this: I put a 'greater than' condition on the clustered index column(s) (it works on single column keys, it should work on multicolumn key if we use the first column of the index), where the comparison value is less than any value of the index column (say min(IndexColumnName)-1). This forces the engine to traverse the index looking for the first record matching the condition to start the scan from there. You could also avoid the index hint, unless that column also belongs to other indexes.
Interesting, these are the execution plans without and with the ordering condition AccountID>0:

UPDATE dbo.JBMTest
SET @PrevRunBal = RunBal = @PrevRunBal + Amount,
@PrevAcctID = AccountID
FROM dbo.JBMTest WITH (INDEX(IX_JBMTest_AccountID_Date),TABLOCKX)
OPTION(MAXDOP 1)

StmtText -------------------------------------------------------------------------------------------------------------------------------------------------------------
|--Clustered Index Update(OBJECT: ([Northwind].[dbo].[JBMTest].[IX_JBMTest_AccountID_Date]), SET: ([@PrevAcctID]=[Expr1005], [JBMTest].[RunBal]=[Expr1004]))
|--Compute Scalar(DEFINE: ([Expr1004]=[@PrevRunBal]=[@PrevRunBal]+[JBMTest].[Amount], [Expr1005]=[@PrevAcctID]=[JBMTest].[AccountID]))
|--Top(ROWCOUNT est 0)
|--Clustered Index Scan(OBJECT: ([Northwind].[dbo].[JBMTest].[IX_JBMTest_AccountID_Date]))

UPDATE dbo.JBMTest
SET @PrevRunBal = RunBal = @PrevRunBal + Amount,
@PrevAcctID = AccountID
FROM dbo.JBMTest WITH (INDEX(IX_JBMTest_AccountID_Date),TABLOCKX)
WHERE AccountID>0
OPTION(MAXDOP 1)

StmtText --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|--Clustered Index Update(OBJECT: ([Northwind].[dbo].[JBMTest].[IX_JBMTest_AccountID_Date]), SET: ([@PrevAcctID]=[Expr1005], [JBMTest].[RunBal]=[Expr1004]), DEFINE: ([Expr1004]=[@PrevRunBal]=[@PrevRunBal]+[JBMTest].[Amount], [Expr1005]=[@PrevAcctID]=[JBMTest].[AccountID]), WHERE: ([JBMTest].[AccountID] > 0))


Quite different, but with the same results in terms of I/O statistics:

Table 'JBMTest'. Scan count 1, logical reads 10186, physical reads 0, read-ahead reads 50.

Ok, time to go back to work.
bye



Salvor
Post #449851
Posted Thursday, January 31, 2008 3:59 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 1:27 AM
Points: 7,001, Visits: 8,438
Very good article, nice testcode.


There may be some testing needed regarding parallelism,
maybe for now just add a OPTION(MAXDOP 1).


The only remarks I would add to the conclusion is :

- Keep in mind that by design there is no order in a setbased approach

- For now it works and it is the fasted propriatary solution for the problem.



Remeber the views with order by clause that suddenly nolonger
externalized the order in SQL2005, but worked so well in SQL2000.


Thank you for sharing the knowledge.


Johan


Don't drive faster than your guardian angel can fly ...
but keeping both feet on the ground won't get you anywhere

- How to post Performance Problems
- How to post data/code to get the best help


- How to prevent a sore throat after hours of presenting ppt ?


"press F1 for solution", "press shift+F1 for urgent solution"


Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me
Post #449858
Posted Thursday, January 31, 2008 4:33 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, January 30, 2014 10:08 PM
Points: 1,038, Visits: 444
I have to agree with the naysayers unfortunately. It's great at the moment and really does seem to work (I haven't seen it fail yet either). I was always suspicious and slightly annoyed when people used the select top 100 percent with an order by in a view - dirty hack in my opinion. This could go the same way. Having said that, perhaps we could get someone from the MS Query Engine team to elaborate on
* The update syntax of @variable = col = newValue
* What happens if you have multiple of these in the same update. eg
@variable = col = col + @variable
col2 = col + @variable
Does col2 effectively get col + col + @variable or just col + @variable?
* Forced ordering.

I'm going to run some quick tests on my sql 2k5 installation to see if there are any quirks (I'd be very surprised if I found something that was overlooked by the likes of Jeff though :P )



Post #449870
Posted Thursday, January 31, 2008 5:17 AM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Today @ 2:46 AM
Points: 575, Visits: 2,500
Christian,
The use of variables in Update statements has always been supported by Microsoft. The full syntax for the original Transact SQL Update statement is...

update [[database.]owner.]{table_name | view_name}
set [[[database.]owner.]{table_name. | view_name.}]
column_name1 =
{expression1 | null | (select_statement)} |
variable_name1 =
{expression1 | null | (select_statement)}
[, column_name2 = {expression2 | null |
(select_statement)}]... |
variable_name2 = {expression1 | null | (select_statement)}
[from [[database.]owner.] {table_name | view_name}
[, [[database.]owner.] {table_name |
view_name}]]...
[where search_conditions]

and the classic example that Sybase gives for assigning variables in an update statement is....

DECLARE @price money
SELECT @price = 0
UPDATE titles
    
SET total_sales = total_sales + 1,
    
@price = price
    
WHERE title_id = 'BU1032'
SELECT @price, total_sales
    
FROM titles
    
WHERE title_id = 'BU1032'
                          
total_sales
------------------------ -----------
                    
19.99        4096




It all seems to be well documented, so I reckon it is safe to use!



Best wishes,

Phil Factor
Simple Talk
Post #449876
Posted Thursday, January 31, 2008 5:38 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 1:45 AM
Points: 2,826, Visits: 3,866
Hi Phil,

you are correct, it obviously is supported.
I misunderstood a section in a book [1] that covered this type of statement.
In this book it was only mentioned that the variable thing is "far from standard",
but obviously not "unsupported".

Thanks for your hint!

[1] Inside Microsoft SQL Server 2005: T-SQL Querying


Best Regards,
Chris Büttner
Post #449884
« Prev Topic | Next Topic »

Add to briefcase 12345»»»

Permissions Expand / Collapse