Solving the Running Total and Ordinal Rank Problems (Rewritten)

* NOTICE * 24 Feb 2011 - This article is in the process of (hopefully) the final "rewrite". * NOTICE *

In the meantime, you'll be very interested to know that Paul White and Tom Thompson have a method of error checking that not only checks for errors virtually without any additional overhead, the method they came up with also helps further guarantee the correct sort order for the updates. You can find an example of the coded method at the following URL which will lead to you a post very late in this thread...

http://www.sqlservercentral.com/Forums/FindPost981258.aspx

The article below in the "first rewrite" and will be available until I can complete the "final rewrite".

For those that remember the first article:

I rewrote this article (which uses a method called the "Quirky Update") because I made a mistake in the first one. The mistake was in the "ORDER BY" example I used... I forgot to drop the clustered index on the test table and that made the example only look like the ORDER BY worked. In reality, it was the "Quirky Update" that made the ORDER BY appear to work. I had them take the original article down until I could rewrite it.

I'd also been using the INDEX(0) hint to force the use of the clustered index in an attempt to quell the fears that some folks may have in using the method. Judging from the discussions on several other forums about the method, the use of the index hint hasn't quelled any fears at all. Ironically, that's a good thing because all it ever did was slow things down... a lot. It's actually not needed and won't be included in any of the examples in this article.

Several other things came to the surface as a result of the discussions on the previous article. Partitioned tables and views created a problem preventing the "Quirky Update" from operating in the correct order because the logical split on the underlying tables is contrary to the clustered index on each table (see Lynn Pettis' article on the subject at http://www.sqlservercentral.com/articles/T-SQL/65522/ ). Another problem is that if parallelism ever took place, the update could work in parallel and that destroys the serial nature of the problem. Those types of things caused the need for either additional rules or additional controls in the code and are included in this rewrite.

For convenience in the discussion link, this article has been republished as if it were a new article. The previous article elicited 251 comments and it would be difficult for new readers to find new comments about the new article. The comments for the old article can be found by visiting the following URL.

http://www.sqlservercentral.com/Forums/Topic449802-203-1.aspx

Why am I spending so much time on such a method? Because if you need to do some previous row calculations, the method in this article can be used to solve a huge number of "procedural" and "previous row" problems at speeds that rival properly written CLR's without the headaches they can sometimes cause.

Last but not least, I apologize for the length of this article. There are a lot of proofs and test problems concerning the "Quirky Update" method and I just didn't want to split them up into two articles. For your convenience, I've included all the code as a single file in the "Resources" section near the end of this article.

I hope you enjoy the rewrite as much as I did writing it.

Introduction:

There are many tasks that we'd like to do in SQL Server that require our code to "remember" the calculated or given values from "previous rows". Such tasks include things like calculating running totals, ranking of rows, creating sequences, creating numbered groupings, counting how many times some rows are duplicated in groups, and, in some instances, copying data from the "previous row" to the "current" row for reporting and other purposes.

Starting in SQL Server 2005, many of those tasks have been become virtual child's play thanks to the advent of the "windowing functions" such as ROW_NUMBER(), RANK(), and others. Unfortunately, there are a lot of folks out there that need to do previous row calculations in SQL Server 2000 and earlier. To add to that, there are some things that even SQL Server 2005 and 2008 have no function for. For example, no version of SQL Server currently has a function to calculate running totals, nor a function to directly copy data from the previous row, nor a function to do groupings on repeated data such as "Start/Stop" indications over time (although there is a clever way to solve that which will be covered in a separate article in the future).

There are many methods to overcome those shortfalls and successfully produce the desired result sets. Obviously, Cursors and While Loops will solve all of these types of "procedural" problems but they also have some pretty severe performance and resource usage problems. There are also a handful of methods to solve procedural problems that look "Set Based" but aren't. Some of those supposedly set based solutions are very simple to write and provide seemingly very good performance but only when the row counts are small. As row counts increase, many of those common methods will absolutely crush server performance by driving CPU and I/O usage into the wall because they can use millions of times more resources than a Cursor or While Loop. Those same methods are also largely responsible for the appearance of the myth that "Set Based" code is slower than a Cursor.

The purpose of this article is to demonstrate a very old, extremely radical, very high performance method for accomplishing some of those "previous row" tasks including Grouped Running Totals and Grouped Running Counts (a form of ranking) in SQL Server 2000 and 2005. The method should also work just fine in SQL Server 2008 but I've not tested it there, yet. It should also work without modification in versions prior to SQL Server 2000 but, again, I've not tested it on those versions although many claim that it has worked correctly since the early days of SyBase.

Other "previous row" tasks will be covered in a separate article because, heh... this one is quite long as it is.

The performance of the method will be measured and compared to procedural methods by running the code against my old friend, the "Million Row Test Table". We'll also computationally verify that all of the methods worked correctly because a million rows just takes too long to check by eye.

We'll also learn a couple of things about indexes and an undocumented DBCC command on the way. A full description of how the code to create the million row test table in a high speed fashion is well beyond the scope of this article but, if you study the code, you'll find some pretty nice tricks. They're not new tricks but a lot of people still don't know about them. Perhaps a future article will cover those.

First, a word for the concerned...

Warning:

Well, sort of. Lots of folks (including some of the "big" names in the SQL world) warn against and, sometimes, outright condemn the method contained in this article as "unreliable" & "unsupported". One fellow MVP even called it an "undocumented hack" on the fairly recent "24 hours of SQL". Even the very core of the method, the ability to update a variable from row to row, has been cursed in a similar fashion. Worse yet, except for the ability to do 3 part updates (SET @variable = columnname = expression) and to update both variables and columns at the same time, there is absolutely no Microsoft documentation to support the use of this method in any way, shape, or form. In fact, even Microsoft has stated that there is no guarantee that this method will work correctly all the time.

Now, let me tell you that, except for one thing, that's ALL true. The one thing that isn't true is its alleged unreliability. That's part of the goal of the article... to prove its reliability (which really can't be done unless you use it. It's like proving the reliability of the SELECT statement). At the end of the article, make up your own mind. If you decide that you don't want to use such a very old ,yet, undocumented feature, then use a Cursor or While loop or maybe even a CLR because all of the other methods are just too darned slow. Heh... just stop telling me that it's an undocumented hack... I already know that and, now, so do you. 😉

Let's get started...

What is a "Running Total"?

In order to have an appreciation for the method we're going to talk about in this article, you must first have an appreciation for what a "previous row" problem is. There are many previous row problems but, to be absolutely clear, let's talk just a bit about what a "running total" is.

The classic "Running Total" or "Running Balance" problem is one of the more difficult calculations to pull off successfully in SQL Server because SQL Server doesn't work well (fast) with procedural code and there is currently no function in any version of SQL Server to accomplish the running total in a true set based fashion. What makes the problem so difficult is that a "balance" on any given row must contain the sum of all previous rows before that balance plus the value of the current row. Adding to the complexity of the problem, it must all be done in temporal or other specific order in order for the running total to make any sense. Obviously, the best thing to do would be to keep the Balance up to date every time a transaction is entered, but that's not always possible especially if you get a batch of multiple transactions.

A "Running Total" can most easily be described by the simple understanding of how things are done in an American check book. If you have a "Starting Balance" of $100 followed by 3 check entries (all subtracted) and a deposit (added), you'll end up with a table that looks something like this...

TransactionID Date       Amount Balance  Memo
------------- ---------- ------ -------  ----------------------
    1         01/01/2000   100     100   Initial Deposit
    2         01/01/2000   -30      70   Check #1
    3         01/01/2000   -10      60   Check #2
    4         01/02/2000   -20      40   Check #3
    5         01/02/2000    50      90   Deposit

Figure 1 - Example Checkbook Table

As each transaction is entered into the checkbook, the current balance is calculated by adding the current transaction to the previous transaction's balance. This balance is known as a "Running Total" or "Running Balance" and each row has the balance up to and including that row. The last row always contains what is known as the "Current Balance" or how much the account is currently worth.

Note that adding a negative number is the same as subtracting. Many accounting systems use only positive numbers and have some sort of indication as to whether a particular transaction is a credit (added) or a debit (subtracted). To keep things simple in this article, we'll simply use signed amounts that are either positive or negative as we did in Figure 1.

In a table of such transactions, like the checkbook example in Figure 1, it IS possible to have identical whole dates and maybe even identical date/times if times are included in the Date column. As a result, inclusion of a TransactionID column as an IDENTITY column will act as a "sort tie-breaker" when it comes to sorting the transactions in the order they were received/entered into the system. Getting the order right IS the very essence of a correct running balance to prevent negative balances from appearing where they shouldn't be which would be responsible for some nasty overdraft fees.

SQL Server 2005 introduced SUM() OVER as a windowing function similar to the ROW_NUMBER() OVER and RANK() OVER windowing functions, but SUM() OVER doesn't work (at least not in T-SQL, yet) like either of those and cannot (at the time of this writing) be used to calculate a running total in T-SQL. Further, SQL Server 2000 and earlier versions are still very prevalent in the world and you just might need to do a running total where no such windowing functions even exist.

To do things like a "Running Total" in SQL Server, you either have to resort to RBAR (pronounced "ree-bar" and is a "Modenism" for "Row By Agonizing Row") using some form of Cursor or While Loop or to some rather nasty set-based-looking code that actually isn't set-based. Most of that supposed set-based code can be thousands or even millions of times worse for performance than a simple Cursor. People have tried self joins and all manner of methods only to find out they've slammed the server into the wall where performance is involved. That type of code usually contains what is known as a "Triangular Join" and the reasons for it being so much worse than a Cursor or While Loop are explained in a previous article titled "Hidden RBAR: Triangular Joins". That article is located at the following URL:

http://www.sqlservercentral.com/articles/T-SQL/61539/

Test Table and Data

Those who know me well will tell you that I always test not only for accuracy and reliability, but for performance and scalability, as well. Both of those factors are extremely important for testing "Running Total" and similar "previous row" code because there are so many terrible ways to do it that seem to work well with small row counts that will choke servers even with minor increases in scalability.

With that in mind, let's build a million row test table to run the various demonstration code snippets against. Now, we could work with nice orderly temporal data for these tests, but let's make a total mess of things with some random data to really stress test the various solutions. Ordinarily, I'd build this data using a single SELECT/INTO just because of the blinding speed, but I don't want to just mess up the data... I also want to make a large number of page splits on the clustered index of the test table to simulate a poorly maintained database. To help make the page splits, we'll also cram each page as full as we can with a FILLFACTOR of 100 on the clustered index and insert the data using a loop. Basically, this will build a test table under the worst conditions possible. Well... except maybe for real life tables. 😉

Create the Test Table and Indexes

In the definition of the test table in the code below, you'll notice that I include some columns that won't originally have any data in them. Those columns are for things like running totals to be demonstrated later on in this article. The following code creates a test table called "TransactionDetail". Please read the embedded comments for additional details.

/*************************************************************************************
 Create the test table with a non-clustered Primary Key and a separate clustered index
 This code has been tested in SQL Server 2000 and 2005.
*************************************************************************************/--===== Do this testing in a nice, "safe" place that everyone has
    USE TempDB
GO
--===== If the test table already exists, drop it in case we need to rerun.
     -- The 3 part naming is overkill, but prevents accidents on real tables.
     IF OBJECT_ID('TempDB.dbo.TransactionDetail') IS NOT NULL
        DROP TABLE TempDB.dbo.TransactionDetail
GO
--===== Create the test table (TransactionDetail) with a NON clustered PK
 CREATE TABLE dbo.TransactionDetail
        (
        TransactionDetailID INT IDENTITY(1,1), --Temporal "tie-breaker"
        Date                DATETIME,
        AccountID           INT,
        Amount              MONEY,
        AccountRunningTotal MONEY,  --Running total across each account
        AccountRunningCount INT,    --Like "Rank" across each account
        NCID                INT,    --For "proof" later in the article
        CONSTRAINT PK_TransactionDetail_TransactionDetailID
                   PRIMARY KEY NONCLUSTERED (TransactionDetailID) 
                   WITH FILLFACTOR = 100
        )
GO
--===== Add a clustered index that will easily cause page splits because
     -- of the randomized data being inserted.  This index also represents
     -- the expected sort order of most of the code examples.
 CREATE CLUSTERED INDEX IXC_Transaction_AccountID_Date_TransactionDetailID
     ON dbo.TransactionDetail (AccountID, Date, TransactionDetailID)
--===== Add a non-clustered index on the NCID column sorted in 
     -- descending order.  This is for some "proofs" later on
     -- in the article.
 CREATE NONCLUSTERED INDEX IX_Transaction_NCID
     ON dbo.TransactionDetail (NCID DESC)
GO

Figure 2 - Build the Test Table

Because the data will be built in a highly randomized, non temporal order, the clustered index will not only suffer a large amount of index fragmentation, but it will also be subject to a huge number of page splits, as well. Page splits are a very bad thing because not only is the index fragmented by page splits, but they also cause SQL Server and the underlying disk system to "jump around" a lot to find data to be processed and that usually causes degradation of performance.

A full discussion on what page splits are is well beyond the scope of this article and, if you don't know what they are, you should search for the term in "Books Online" (the help documentation that comes with SQL Server) and on the internet for some of the detailed discussions on the subject. The short description is that when data is inserted into the underlying pages of a table, it's inserted into that page in the same order as defined by the clustered index definition for each batch of inserts. If the page is full, about half the data is moved to a new page and then the new row is inserted in the correct order. This is one of the reasons why people say that you cannot rely on the "natural order" of data because it can change at anytime.

Populate the Test Table with Randomized Data

Like I said, we want to build the worst-case data. We want the clustered index (and, therefore, the data) to be as out of "natural order" as possible and with page splits galore so we can ensure that all the methods we're going to test will actually work in the correct temporal order no matter what. To do that, we're going to use some techniques to create highly randomized data and, yes, we're going to use a While Loop to insert batches of that randomized data to simulate "real life". Those inserts of highly randomized batches of data will also cause a huge number of page splits because they will be inserted into full pages in a random fashion.

Special thanks goes out to Matt Miller, Peter Larson, and Michael Valentine Jones because, over time, they have all contributed to the current form of these randomization methods in one way or another.

Without further ado, here's the code to populate the million row test table with highly randomized, highly fragmented data...

/*************************************************************************************
 Populate the table using a rather slow method but one that's sure to cause lots of
 Page splits and that will fragment the table with over 99% fragmentation.
*************************************************************************************/--===== Preset the environment for appearance and speed
    SET NOCOUNT ON
--===== Populate the table in "segments" to force page splits.
     -- Normally this would NOT have a While loop in it.
     -- Because the While loop is there and page splits are happening,
     -- this takes a whopping 00:02:50 to create on my box.
  WHILE (ISNULL(IDENT_CURRENT('TransactionDetail'),0)) < 1000000
  BEGIN
         INSERT INTO dbo.TransactionDetail
                (Date, AccountID, Amount)
         SELECT TOP 10000
                --10 years worth of dates with times from 1/1/2000 to 12/31/2009
                CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME) AS Date,
                --100 different account numbers
                ABS(CHECKSUM(NEWID()))%100+1,
                --Dollar amounts from -99.99 to + 99.99
                CAST(CHECKSUM(NEWID())%10000 /100.0 AS MONEY)
           FROM Master.dbo.SysColumns sc1
          CROSS JOIN Master.dbo.SysColumns sc2
    END
--===== Update the NCID column to be the reverse of the TransactionDetailID column
 UPDATE dbo.TransactionDetail
    SET NCID = 1000000 - TransactionDetailID + 1
GO

Figure 3 - Populate the Test Table in a Highly Randomized Fashion

Checking the Fragmentation and Page Split Levels

We're not going to spend a lot of time on this, but I wanted you to see just how badly the clustered index on our test table has been fragmented and to get a feel for how many pages have been split. Read the comments in the following code and then run it.

/*************************************************************************************
 Show the fragmentation and the page splits
*************************************************************************************/--===== Look for the "Logical Scan Fragmentation" here... 
     -- The higher the number, the worse the fragmentation is.
     -- Also notice the average page density is just a little over half full.
     -- The output from this command will show in the "Messages" tab.
   DBCC SHOWCONTIG ('TransactionDetail')
--===== If you compare "PagePID" to "PrevPagePID" and "NextPagePID", you can see that
     -- many page splits took place.  In an "unsplit" index, the PrevPagePID will
     -- usually be 1 less than the PagePid and the NextPagePID be 1 more than the 
     -- PagePID.  If they are not, there's a good chance that a page split occurred.
     -- The output from this command will will show either in the "Grid/Results" tab 
     -- or in the "Messages" tab if in the text mode.
   DBCC IND (0, 'TransactionDetail', 1)
GO

Figure 4 - Check for Fragmentation and Page Splits

DBCC SHOWCONTIG scanning 'TransactionDetail' table...
Table: 'TransactionDetail' (1957582012); index ID: 1, database ID: 2
TABLE level scan performed.
- Pages Scanned................................: 11127
- Extents Scanned..............................: 1399
- Extent Switches..............................: 11122
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 12.51% [1391:11123]
- Logical Scan Fragmentation ..................: 99.10%
- Extent Scan Fragmentation ...................: 49.82%
- Avg. Bytes Free per Page.....................: 2524.0
- Avg. Page Density (full).....................: 68.82%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

Figure 5 - Results of DBCC SHOWCONTIG

Figure 5 shows the results from the DBCC SHOWCONTIG command in Figure 4 from the "Messages" tab of either Query Analyzer or SSMS. Notice that the "Logical Scan Fragmentation" is more than 99%. You just can't get much worse than that so far as fragmentation goes. This table is a real mess now, just like we wanted. You can also guess that some page splits have occurred because the "Avg Page Density" is only 68.82%. To see just how bad the page splits really are, take a look at the output from the DBCC IND command in Figure 4 on the GRID of either Query Analyzer or SSMS along with the following explanation (too large to list in this article).

You may be asking, "What is DBCC IND?" It's one of those nice undocumented commands that list a bunch of information about a table and its indexes. The DBCC IND command shows, among other things, the internal page numbers for the data pages (hence, clustered index) of our test table. Like the comments in the code state, compare "PagePID" to "PrevPagePID" and "NextPagePID". In an "un-split" clustered index on a brand new table such as this one, the PrevPagePID will usually be 1 less than the PagePid and the NextPagePID will be 1 more than the PagePID. If they are not, there's a good chance that a page split occurred during the inserts made to the new table especially in a controlled test environment.

Gail Shaw turned me on to the DBCC IND command. For more detailed information on DBCC IND, please see the following URL which is a part of Paul Randal's blog. My understanding is that Paul Rand wrote many of the DBCC commands and that probably makes him an authority on the subject... 😉

https://www.sqlskills.com/blogs/paul/

Planning the Running Total

Before we start writing code to create running totals, let's build a quick little bit of pseudo-code just so we can easily keep track of what it is that we're trying to accomplish for the running total for each account in the TransactionDetail table. You can easily translate the following procedural (RBAR) pseudo-code to virtually any code language that can read and write to the database or a file.

For each account:

1. Read a row from the table in ascending date and transaction ID order for this account.

2. If this is the first row for this account, set the running balance variable to the current amount.

Otherwise, add the amount from the row to the running balance variable.

3. Write the running balance variable back to the current row in the table.

4. Loop back to Step 1 above and repeat until all rows have been updated in order.

Using a Cursor to Solve for Running Totals

Let's write a cursor to solve the running total problem we just defined with the pseudo-code. In other words, let's write a cursor to do running totals by AccountID in temporal order by Date using the TransactionID as a TieBreaker...

/*************************************************************************************
 Straight forward cursor method to calculate the running total for each AccountID.
 Runs in almost 8 minutes on the million row test table on my 7 year old desktop 
 computer.
*************************************************************************************/--===== Suppress the auto-display of row counts for speed an appearance
   SET NOCOUNT ON
--===== Declare the cursor storage variables
DECLARE @Amount              MONEY
DECLARE @CurAccountID        INT
--===== Declare the working variables
DECLARE @PrevAccountID       INT
DECLARE @AccountRunningTotal MONEY
--===== Create the cursor with rows sorted in the correct
     -- order to do the running balance by account
DECLARE curRunningTotal CURSOR LOCAL FORWARD_ONLY
    FOR
 SELECT AccountID, Amount
   FROM dbo.TransactionDetail
--  WHERE AccountID <= 10 --Uncomment for "short" testing
  ORDER BY AccountID, Date, TransactionDetailID
   OPEN curRunningTotal
--===== Read the information from the first row of the cursor
  FETCH NEXT FROM curRunningTotal
   INTO @CurAccountID, @Amount
--===== For each account, update the account running total 
     -- column until we run out of rows.  Notice that the
     -- CASE statement resets the running total at the 
     -- start of each account.
  WHILE @@FETCH_STATUS = 0
  BEGIN
--===== Calculate the running total for this row
     -- and remember this AccountID for the next row
 SELECT @AccountRunningTotal = CASE 
                               WHEN @CurAccountID = @PrevAccountID 
                               THEN @AccountRunningTotal + @Amount 
                               ELSE @Amount 
                               END,
        @PrevAccountID = @CurAccountID
--===== Update the running total for this row
 UPDATE dbo.TransactionDetail 
    SET AccountRunningTotal = @AccountRunningTotal
  WHERE CURRENT OF curRunningTotal
--===== Read the information from the next row of the cursor
  FETCH NEXT FROM curRunningTotal
   INTO @CurAccountID, @Amount
    END --End of the cursor
--======== Housekeeping
     CLOSE curRunningTotal
DEALLOCATE curRunningTotal
GO

Figure 6 - Cursor to Solve the Running Total Problem

You probably noticed in Figure 6 that the SELECT statement that calculates the running total variable uses the @PrevAccountID and then also has code to update the @PrevAccountID variable in the same SELECT. With variables, SELECT's do things in the order listed just as CASE statements (kind of) do their evaluations in the order listed (except for data-type evaluations). It's another "undocumented feature" and if it makes you uncomfortable, just split the SELECT into two different SET statements. It'll be a bit slower, but you'll be comfortable. 😉

As a side bar, I've heard the rumor that multi-variable assignments using a single SELECT has been deprecated. That must be an ANSI thing because I've found no sign of such a thing in the deprecation list for either SQL Server 2005 or 2008. Check it out for yourself using the following URLs that lead straight to the horse's mouth. If I'm wrong, drop me a line and let me know.

http://msdn.microsoft.com/en-us/library/ms143729(SQL.90).aspx

http://msdn.microsoft.com/en-us/library/ms143729.aspx

Getting back to the Cursor code... Just like our problem definition stated, the CURSOR code reads a row, updates the running total variable, and then writes those results back out to the row in the table. It's a very tight loop and pretty fast if I do say so myself. It successfully and correctly updates the AccountRunningTotal column in the table in about 7 minutes and 40 seconds on my humble desktop PC. Not bad for a million rows (heh... yeah... right. Wait and see.).

Verifying the Results

Verifying the results of the running total calculation isn't difficult, either. What we simply need to do is a reverse calculation that we can trust and, since it would be nice to do it with a little speed, we'll let SQL Server do the verification for us.

The first thing we need to do is copy the data in the correct order from the test table into a verification table that has contiguous, sequential row numbers. Then, we can do a self-join on that table with a 1 row offset to verify all the calculations we did. Except when the AccountID changes, the AccountRunningTotal column should be equal to the previous row's AccountRunningTotal plus the current row's Amount. Copying the data can be done very quickly by using SELECT/INTO and the IDENTITY function which also works in all versions of SQL Server.

As yet another side bar, for those that still believe the myth that SELECT/INTO paralyzes the system tables of TempDB, be advised that although that was once true, it hasn't been true since SQL Server 6.5 SP 1. I don't, however, expect you to simply take my word for it, but it is well beyond the scope of this article to prove it to you with code. So, instead, take Microsoft's word for it. Please see the following URL and pay particular attention to where it says "NOTE: This problem does not apply to SQL Server 7.0 and later."

http://support.microsoft.com/kb/153441/EN-US/

I have actually tested the myth and, of course, the code for that test is well beyond the scope of this article, as well. When time allows and providing that no one beats me to it, I'll write a separate article on the subject of "The Myth of SELECT/INTO".

Here's the code to do the account running total verification. I've written it as a stored procedure because we'll need to use it more than once in this article. Again (as always), the details are in the comments...

USE TEMPDB
GO
CREATE PROCEDURE dbo.Verify AS
/*************************************************************************************
 Code to verify that the account running total calculation worked correctly.
 Please read the comments to see how it works.
*************************************************************************************/--===== Conditionally drop the verification table to make
     -- it easy to rerun the verification code
     IF OBJECT_ID('TempDB..#Verification') IS NOT NULL
   DROP TABLE dbo.#Verification
--===== Define a variable to remember the number of rows
     -- copied to the verification table
DECLARE @MyCount INT
--===== Copy the data from the test table into the
     -- verification table in the correct order.
     -- Remember the correct order with an IDENTITY.
 SELECT IDENTITY(INT,1,1) AS RowNum,
        AccountID,
        Amount,
        AccountRunningTotal
   INTO #Verification
   FROM dbo.TransactionDetail
  ORDER BY AccountID, Date, TransactionDetailID
--===== Remember the number of rows we just copied
 SELECT @MyCount = @@ROWCOUNT
--===== Check the running total calculations
 SELECT CASE 
            WHEN COUNT(hi.RowNum) + 1 = @MyCount
            THEN 'Account Running Total Calculations are correct'
            ELSE 'There are some errors in the Account Running Totals'
        END
   FROM #Verification lo
  INNER JOIN
        #Verification hi
     ON lo.RowNum + 1 = hi.RowNum
  WHERE (-- Compare lines with the same AccountID
         hi.AccountID = lo.AccountID
         AND hi.AccountRunningTotal = lo.AccountRunningTotal + hi.Amount)
     OR
        (-- First line of account has running total same as amount
         hi.AccountID <> lo.AccountID
         AND hi.AccountRunningTotal = hi.Amount)
GO

Figure 7 - Code to Verify the Account Running Total

If you run the code in Figure 8 (below), it well execute the code in Figure 7 and, after about 10 seconds, you should get the message "Account Running Total Calculations are correct". If there's anything wrong with the running total, you'll get the message "There are some errors in the Account Running Totals". In the case of the Cursor code, you will get the "correct" answer.

Here's how to run the code for Figure 7 to verify that the running total worked correctly on the TransactionDetail table:

--===== Verify the the running total worked on the 
     -- TransactionDetail table
    USE TEMPDB --Takes 10 seconds on my machine
   EXEC dbo.Verify
GO

Figure 8 - Code to Verify the Running Total

Resetting the Test Table

We could rerun the test table generation code between runs, but that would cause a change in the randomized data. Any changes in that data could affect the relative run times. Instead, we'll just set the calculated column data back to NULL. Since we'll be doing this a lot, I made a simple stored procedure to do the reset. Here's the code.

/**************************************************************************************
 This stored procedure will clear the calculated columns in the test table without
 disturbing the randomized data in the table so that we can repeat tests and use
 different methods without changing the test data.
**************************************************************************************/    USE TEMPDB
GO
 CREATE PROCEDURE dbo.ResetTestTable AS
 UPDATE dbo.TransactionDetail
    SET AccountRunningTotal = NULL,
        AccountRunningCount = NULL
GO

Figure 9 - Code to Reset the Running Total Columns

Run the code in Figure 9 to build the procedure. Then, run the following code anytime you want to clear the calculated columns in the test table. The code takes about 6 seconds to run. Remember that number...

EXEC dbo.ResetTestTable

Figure 10 - EXEC to Clear the Running Total Columns

Run the code in Figure 10 now because we're getting ready to demonstrate another method and we need the columns to be clear just to prove it works.

The (ugh!) Triangular Join Running Total Method

You'll soon see why I say "ugh!" here. This is one of those supposedly "Set-Based" methods that folks use to solve the running total problem that gives the term "Set-Based" a really bad name. Let's demo the code first and then we'll talk about what makes it so "ugh!" bad.

/*************************************************************************************
 This does a running total using the "Triangular Join" method.  It looks "Set-Based"
 because there's no explicit loop declared.  In fact, it's RBAR ON STEROIDS!
*************************************************************************************/ 
 SELECT td.AccountID,
        td.Date,
        td.TransactionDetailID,
        td.Amount,
        (--==== Triangular join to get the sum of previous rows for each row
         SELECT SUM(td2.Amount) 
           FROM dbo.TransactionDetail td2 
          WHERE td2.AccountID = td.AccountID
            AND td2.Date     <= td.Date
        ) AS AccountRunningTotal
   FROM dbo.TransactionDetail td
  WHERE td.AccountID = 1
  ORDER BY td.AccountID, td.Date, td.TransactionDetailID
GO

Figure 11 - Triangular Join Running Total

The little chunk of slick looking code in Figure 11 works correctly but it takes about 1 minute and 14 seconds to run on my box. "What's so bad with that?" you ask. Well, look carefully at the code. That amount of time is to calculate the running totals for just ONE AccountID with about 10,000 rows. To process all 100 accounts at that rate, the code would need to run for over 2 hours!

Heh... no wonder this bit of code gives "Set-Based" a bad name.

Again, for a much more detailed discussion on Triangular Joins and their devastating effect on performance and resources, please see the article titled "Hidden RBAR: Triangular Joins" at the following URL

http://www.sqlservercentral.com/articles/T-SQL/61539/

Introduction to Pseudo-Cursors

So, it appears that attempts to do something procedural, like doing a running total or anything else that involves carrying something forward from the "previous row", can only lead to some real performance problems. Looking back in this article, the clear winner (so far) for doing running totals is the Cursor method.

What if I told you that there's such a thing as a "Cursor" with no explicitly defined While Loop? What if I told you that "Set Based Loops" actually exist? What if I told you we CAN do procedural (RBAR) tasks in a set based manner?

Consider the following code, please...

 SELECT *
   FROM TempDB.dbo.TransactionDetail

Figure 12 - Simple SELECT: What does it do behind the scenes?

What does the code in Figure 12 actually do? Sure, we all know that it's going to return all the rows from the TransactionDetail table, but what does it do behind the scenes? Believe it or not, it's nothing more than a loop. Heh... that's right, you heard it here first! T-SQL is nothing more than some really easy code designed to loop through rows in a file (ie. table) just like any other file-centric code. The code in Figure 12 does nothing more than read 1 row at a time and display 1 row at a time. Sure, it's fast but, it's still nothing more than a loop in the code behind the scenes.

Now, tell me... is the following code a loop behind the scenes as well?

 UPDATE TempDB.dbo.TransactionDetail
    SET RunningCount = NULL

Figure 13 - Simple UPDATE: What does it do behind the scenes?

You bet it is. Just like SELECT's, UPDATE's have a loop in them behind the scenes. Because SELECT's and UPDATE's loop behind the scenes just like you would with a Cursor, some of us refer to them as "Pseudo-Cursors". You'll also hear people say things like, "Well, because of the Cursor in the UPDATE, yada-yada". What they're referring to is the looping that goes on behind the scenes and not an explicitly declared cursor.

By the way, thank you R.Barry Young. I used to call these things "Set Based Loops". I first heard the term "Pseudo-Cursor" from you and I'll admit that not only is it more correctly descriptive, it's also a heck of a lot easier to say. 😉

Now, here's an even stranger question. What order does a SELECT return its rows in? If you answered "Some order that cannot be guaranteed without an ORDER BY", you'd be absolutely correct. That's one of the reasons there's an ORDER BY clause for SELECTs... to guarantee the order of the return.

But, what about UPDATE's? There's no ORDER BY in the UPDATE command and we all know that trying to force the order of an UPDATE with a join to an ordered sub-query just isn't reliable (not to worry, we'll prove that later in the article). It also isn't guaranteed even in an ordered CTE (yep, we'll prove this later, as well).

So, what order do UPDATEs do their business in? Is it random or what?

Let's find out using a very special method...

Introducing the "Quirky Update"

Although I've been using the technique for years, the first time I heard the term "Quirky Update" was from an article on Simple-Talk.com by Robyn Page ( http://www.simple-talk.com/sql/learn-sql-server/robyn-pages-sql-server-cursor-workbench/ ). I've always called it just an "Update" but it does have some special characteristics deserving of such a special name. So, like Robyn, I'll refer to a special form of UPDATE as the "Quirky Update" and it, too, has a Pseudo-Cursor that works behind the scenes.

What's so special about an UPDATE in SQL Server to earn it the name of "Quirky Update"? Well, like any UPDATE in other database engines, it'll update a column of data but there are a couple of things that make it "quirky" in SQL Server. For one, it can also update variables at the same time it updates columns for each row it processes without an explicit Cursor. It IS a documented feature as seen in the following quote from Books Online where I've highlighted two items in red...

UPDATE

{

table_name WITH ( < table_hint_limited > [ ...n ] )

| view_name

| rowset_function_limited

}

SET

{ column_name = { expression | DEFAULT | NULL }

| @variable = expression

| @variable = column = expression } [ ,...n ]

{ { [ FROM { < table_source > } [ ,...n ] ]

[ WHERE

< search_condition > ] }

Let's use the ability of the "Quirky Update" to update variables and the underlying columnar data in the table to see what order a "Quirky Update" uses in the pseudo-cursor it uses behind the scenes.

--===== "Quirky Update" shows us the order that an UPDATE uses.
     -- Notice that on my box, this only takes 6 seconds
DECLARE @Counter INT
 SELECT @Counter = 0
 UPDATE TempDB.dbo.TransactionDetail
    SET @Counter = AccountRunningCount = @Counter + 1
   FROM TempDB.dbo.TransactionDetail WITH (TABLOCKX)
 OPTION (MAXDOP 1)
GO

Figure 14 - "Quirky" Update Produces a Running Count

If we run the code from Figure 14 with the Actual Execution Plan turned on, we see that it does a "Clustered Index Update" during the UPDATE even though there are no columns in the UPDATE from the Clustered Index. It also does a Clustered Index Scan to read the data.

Figure 15 - Simple Update Uses "Clustered Index Update"

That would be expected since we're updating the entire table and the clustered index is actually in the table. It doesn't prove anything though. Let's do a SELECT ordered by the same columns as in the clustered index and see what happens.

--===== Select all the rows in order by the clustered index
 SELECT *
   FROM TempDB.dbo.TransactionDetail
  ORDER BY AccountID, Date, TransactionDetailID
GO

Figure 16 - Manually Verify the Running Count

After you run the code in Figure 16 , look at the AccountRunningCount column. Even with the super fragmented, highly page split test table that we're using, the "Quirky Update" can indeed be "quirky" in that it will update in the same order as clustered index even in the presence of other indexes like the one that was created when we made a non-clustered primary key on the table.

No Need for Index Hints

As we'll see later on in the "ORDER BY" proofs, it's not the scan that matters, it's the actual update that matters. If it has a "Clustered Index Update" as the last step (on the left), then the UPDATE will be done in the same order as the Clustered Index. That means that no index hints like WITH (INDEX(0)) (forces a clustered index scan on the right of the execution plan) are required, ever. That's actually a good thing because the index hint just slows down the update down by a factor of 6. That's a lot of overhead for something that's not actually needed.

Table Modification by Others Must Be Prevented

It is, however, very important that no one INSERT, UPDATE, or DELETE anything in the table while trying to do a running total or other "previous row" calculation even if you're not using the "Quirky Update". While an UPDATE occurs, even on a whole table, it will usually start out with just row locks and eventually escalate to a table lock. While that's happening, someone could sneak in and do a modification. To prevent that, we can lock the whole table up front. Sure, it may cause the code to wait briefly until everyone is "out" of the table, but it's worth the very brief wait to ensure the running totals are done correctly... on a million rows. 😉

--===== "Quirky Update" with table hint to guarantee exclusive
     -- use of the table.
DECLARE @Counter INT
 SELECT @Counter = 0
 UPDATE TempDB.dbo.TransactionDetail
    SET @Counter = AccountRunningCount = @Counter + 1
   FROM TempDB.dbo.TransactionDetail WITH (TABLOCKX)
 OPTION (MAXDOP 1)
GO

Figure 17 - An Important Index Hint.

You should check Books Online for yourself, but TABLOCKX, once no one is using the table, prevents anyone else from "stepping in" on the table during the update. It all sounds brutal, but it runs so fast, I'll put up with the very short "outage" on a million rows especially when you compare it to how slow (almost 8 minutes) a Cursor is to do the same thing.

The bottom line for this section in the article is that when a Clustered Index is available, the "Quirky Update" will occur in the same order as the Clustered Index. There are some rules to follow to ensure this and a complete list of those rules is available near the end of this article.

Let's try what we've learned so far using the same Running Total algorithm as we used in the real Cursor code...

The "Quirky Update" Running Total

Before we start, let's clear the columns to be calculated in the test table. Execute the following code, please.

EXEC dbo.ResetTestTable
GO

Run the following code. Notice that it follows the same basic algorithm as the running total code we used for the real Cursor running total code except it relies on the pseudo-cursor looping that the "Quirky Update" does behind the scenes instead of the While Loop in a real Cursor.

/*************************************************************************************
 Pseduo-cursor Running Total update using the "Quirky Update" takes about 4 seconds
 on my box.
*************************************************************************************/--===== Supress the auto-display of rowcounts for speed an appearance
   SET NOCOUNT ON
--===== Declare the working variables
DECLARE @PrevAccountID       INT
DECLARE @AccountRunningTotal MONEY
--===== Update the running total for this row using the "Quirky Update"
     -- and a "Pseudo-cursor"
 UPDATE dbo.TransactionDetail 
    SET @AccountRunningTotal = AccountRunningTotal = CASE 
                                                     WHEN AccountID = @PrevAccountID 
                                                     THEN @AccountRunningTotal+Amount 
                                                     ELSE Amount 
                                                     END,
        @PrevAccountID = AccountID
   FROM dbo.TransactionDetail WITH (TABLOCKX)
 OPTION (MAXDOP 1)
GO

Figure 18 - Code for the "Quirky Update" Running Total

Before we verify that the running total worked correctly, let's take it one step further and modify the code to give us a running count for each AccountID as well. Think of it as adding a row number to each row for each account that starts over at 1 at the first row of each AccountID. If that sounds familiar, it should. We're going to do the same thing that ROW_NUMBER() OVER does in SQL Server 2005, but this method will also work in SQL Server 2000 and earlier.

First, clear the columns we want to calculate...

EXEC dbo.ResetTestTable

... and then run the following code to calculate both the running total and the running count by AccountID...

/*************************************************************************************
 Pseduo-cursor update using the "Quirky Update" to calculate both Running Totals and
 a Running Count that start over for each AccountID.
 Takes 24 seconds with the INDEX(0) hint and 6 seconds without it on my box.
*************************************************************************************/--===== Supress the auto-display of rowcounts for speed an appearance
   SET NOCOUNT ON
--===== Declare the working variables
DECLARE @PrevAccountID       INT
DECLARE @AccountRunningTotal MONEY
DECLARE @AccountRunningCount INT
--===== Update the running total and running count for this row using the "Quirky 
     -- Update" and a "Pseudo-cursor". The order of the UPDATE is controlled by the
     -- order of the clustered index.
 UPDATE dbo.TransactionDetail 
    SET @AccountRunningTotal = AccountRunningTotal = CASE 
                                                     WHEN AccountID = @PrevAccountID 
                                                     THEN @AccountRunningTotal + Amount 
                                                     ELSE Amount 
                                                     END,
        @AccountRunningCount = AccountRunningCount = CASE 
                                                     WHEN AccountID = @PrevAccountID 
                                                     THEN @AccountRunningCount + 1 
                                                     ELSE 1 
                                                     END,
        @PrevAccountID = AccountID
   FROM dbo.TransactionDetail WITH (TABLOCKX)
 OPTION (MAXDOP 1)
GO

Figure 19 - The "Quirky Update" Running Total and Running Count

For those playing along as we go, did you notice anything when you ran the code in Figure 19? Doing both the running total and the running count only took a second or two longer than just doing either one alone! It also only took a second or two longer than the "reset table" code takes and it's a straight update!

Ok. Let's verify that the running total worked. Let's run the same code that we used to verify the Cursor run, again...

--===== Verify the the running total worked on the 
     -- TransactionDetail table
    USE TEMPDB --Takes 10 seconds on my machine
   EXEC dbo.Verify
GO

Figure 20 - Verifying the Running Total for the Quirky Update

That should produce the message "Account Running Total Calculations are correct".

It doesn't matter if the clustered index is all fragged up or if it has just been rebuilt. The "Quirky Update" still does things in the correct order. What that means to us is that, although regular maintenance on the clustered index is strongly advised to save disk space by eliminating page splits, the "Quirky Update" method for calculating "previous row" problems like running totals and running counts is very tolerant of even a poor maintenance schedule.

ORDER BY and Extra "Warm Fuzzies"

I said earlier that the use of ORDER BY to establish the order of a running total would never work reliably and then kind of left you hanging but did say we'd prove it. This section of the article is also one of the major reasons why I rewrote the article. In the previous article, I forgot to drop the clustered index and that made it look like the ORDER BY was successful. We're going to correct that mistake on my part right now. We'll also see what happens when there's no Clustered Index.

When a Clustered Index is Present

Remember that we have a clustered index on the test table that has an order of AccountID, Date, TransactionDetailID. Let's try to do a running total in order by the NCID column only. Keep in mind that the NCID column has an index on it having no columns directly associated with the Clustered Index. There are a thousand different ways to try and do an explicitly ordered update and none of them work reliably. I'm only going to show you two because they cover the basis of most of the other methods for trying to do an ordered UPDATE. Keep in mind that UPDATE does not have its own ORDER BY so we have to pass the UPDATE through a vehicle that does. Here's the code (notice that the "reset table" code is built in)...

/*************************************************************************************
 Pseduo-cursor Running Total update using an "ordered" CTE that's not in the same 
 order as the clustered index.
*************************************************************************************/--===== Reset the running total columns
   EXEC dbo.ResetTestTable
--===== Supress the auto-display of rowcounts for speed an appearance
   SET NOCOUNT ON
--===== Reset the running total/count columns
   EXEC dbo.ResetTestTable --This isn't actually necessary in "real" code
--===== Declare the working variables
DECLARE @NCID         INT,
        @AccountRunningTotal MONEY,
        @AccountRunningCount INT
 SELECT @AccountRunningTotal = 0,
        @AccountRunningCount = 0
--===== Update the running total using multipart updates
     -- applied to an "ordered" CTE.
;WITH cteOrdered AS
(
 SELECT TOP 2147483648
        NCID,
        Amount,
        AccountRunningTotal,
        AccountRunningCount 
   FROM dbo.TransactionDetail
  ORDER BY NCID DESC --Don't forget... reverse order here
)
 UPDATE cteOrdered
    SET @AccountRunningTotal = AccountRunningTotal = @AccountRunningTotal + Amount,
        @AccountRunningCount = AccountRunningCount = @AccountRunningCount +1,
        @NCID = NCID 
   FROM cteOrdered WITH (TABLOCKX)
 OPTION (MAXDOP 1)
GO

Figure 21 - Ordered Update Attempt Used CTE

Let's do a real simple check to see if there's even a chance that worked...

--===== Show that the "ORDER BY" method didn't work.
     -- The running balance column makes no sense in this order.
     -- Don't forget... it's in reverse order
 SELECT TOP 100 * 
   FROM dbo.TransactionDetail
  ORDER BY NCID DESC

Figure 22 - Simple Verification by NCID (it failed)

We don't even need a verification routine to see how wrong that is. The running balance column makes absolutely no sense in order by NCID even though we told it to do the update in order by NCID.

So what order did it do the update in? Care to hazard a guess? Remember that I said that "Quirky Updates" on a single table will always update in order by the Clustered Index if a "Clustered Index Update" occurred? Well, guess what?

Figure 23 - Execution Plan from Attempted Ordered CTE

The last step was a "Clustered Index Update" even though we explicitly told it to sort by the indexed NCID column. Could it be possible that the UPDATE still occurred in the same order as the Clustered Index? Let's find out...

--===== Show that the "Quirky Update" overrode the ORDER BY
     -- and still performed the update in clustered index order.
 SELECT TOP 100 * 
   FROM dbo.TransactionDetail
  ORDER BY AccountID, Date, TransactionDetailID
GO

Figure 24 - Simple Verification by Clustered Index (it passed)

Now the running balance column makes a whole lot more sense because the SELECT in Figure 24 is sorted in the same order as the clustered index. If you run the verification code again, you'll see that the running total code passes but not because of the ORDER BY... it's because even though we tried to override the order with and explicit ORDER BY, the "Quirky Update" ignored all that and still did the update in Clustered Index order.

To summarize this section so it doesn't turn out to be 8 pages long, ORDER BY on an UPDATE doesn't work with the "Quirky Update". Neither do index hints (you can try that on your own). The good part about all of this is that the "Quirky Update" will override any and all of that. That's a pretty nice "warm fuzzy" when people ask you if you're sure it will work. 😉

When a Clustered Index is NOT Present

Rather than list all of the code again, let me just tell you to run the code from Figure 2 to build the table and the indexes EXCEPT don't include the Clustered Index. Then run the code from Figure 3 to populate the table just like before. What you'll end up with was highly randomized test data like before and everything is the same except for the missing Clustered Index.

Then run the code from Figure 21, the ordered update attempt by CTE. When you run the simple SELECT code from Figure 22, it looks like it worked. But, if you run the verification from Figure 10, the message "There are some errors in the Running Totals" will appear meaning that it didn't work.

If you try the same thing using an INDEX HINT to try to establish the order of the UPDATE, the answer will be the same... "There are some errors in the Running Totals".

To summarize, you cannot force an ordered update using ORDER BY or an INDEX HINT. If a Clustered Index is present, it will still do the update in Clustered Index order. If you don't have a Clustered Index on the table, you'll simply get bad answers.

The bottom line is, if you want to do running totals using a "Quirky Update", a Clustered Index with the correct sort order must be present. If you don't have one of the one that exists on a table cannot be changed, then you need to use SELECT/INTO to copy the correct data to a Temp Table and add the correct Clustered Index to that.

Speaking of SELECT/INTO...

Partitioned Table and View Problems

I just haven't had the time (yet) to do a deep dive on the problems associated with using the "Quirky Update" on partitioned tables and views. I don't feel too bad about that, though, because Lynn Pettis wrote just a dandy article on the subject. Here's the URL for that great and very complete article.

http://www.sqlservercentral.com/articles/T-SQL/65522/

The bottom line is, don't use the "Quirky Update" methods on Partitioned Tables or Views.

The work around (for now, anyway) is to simply copy the required columns to a Temp Table, add the required clustered index, and do the "Quirky Update" on that.

Problems with Parallelism and the Fix

You probably noticed... every example of "Quirky Update" code in this article used OPTION (MAXDOP 1). By its very definition, running total and running count problems are procedural (serial) in nature even when using the "Quirky Update". That is, they must be accomplished one row at a time. It is possible that parallelism could enter the picture for those updates. That would destroy the required "serial" or procedural nature that solving these problems require.

For that reason, it is absolutely imperative that you make it impossible for parallelism to occur whenever you use any of the "Quirky Update" methods. That means that every method MUST use OPTION (MAXDOP 1) as part of the code. I cannot stress enough how very important the use of this option is.

OMG! I BROKE IT!

Heh... kind of. Sometimes, the 3 part SET statement loses its mind during development depending on what you're trying to do. It WILL eventually happen to you as you're developing.

A very special thanks go out to Erik Eckhardt and Karl Schmitt for introducing me to this problem with the 3 part SET statement. Their recommendation is to never use the 3 part SET statement and after you see the following, you may agree that only 2 part SET statements should be used for the "Quirky Update".

The following code is supposed to do a "Quirky Update" where a running count starts at 0 and is supposed to count up by (1+2) or 3 for each count. The output is supposed to be 3, 6, 9, 12 but, instead, only counts by 3 for the first row and counts by 1 for all the other rows. Although it's a very strange way of doing a running count by 3, let's see that example of the 3 part SET statement losing its mind so you know what to look for if it ever happens to you. Here's the code...

--===== Create a 4 row test table variable and preset all the rows to zero.
     -- It also creates an "Expected" column which should be the value of
     -- the SomeInt column (starts a 3, counts by 3) when we're done.
DECLARE @OMG TABLE 
        (RowNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, SomeInt INT, Expected INT)
 INSERT INTO @OMG (SomeInt, Expected) 
 SELECT 0,3 UNION ALL SELECT 0,6 UNION ALL SELECT 0,9 UNION ALL SELECT 0,12
--===== Create a variable to hold a number 
     -- and preset it to some known value
DECLARE @N INT, @Anchor INT
 SELECT @N = 0
 
--===== Do a 3 part "Quirky Update" that is supposed to count by 3's (1+2)
     -- but loses it's mind instead.  It only does the @N + 2 for the 
     -- first row.
 UPDATE @OMG
    SET @N      = @N + 1,           --Adds 1 to N 
        @N      = SomeInt = @N + 2, --"Forgets" to do @N + 2 after first row
        @Anchor = RowNum
   FROM @OMG --WITH (TABLOCKX) --Can't be used on a table variable
 OPTION (MAXDOP 1)
--===== Display the result. Should start at 3 and count by 3's.
     -- But it doesn't!!! It starts at 3 and counts by 1's.  It didn't do
     -- the @N + 2 for anything except the first row.
 SELECT * FROM @OMG
GO

Figure 25 - 3 Part Set Statement Sometimes "Forgets"

Here are the results from the code in Figure 25. As you can see, the values in SomeInt don't match the expected values in the Expected column. There's a very long winded explanation about this that I'll very gratiously step aside and let Erik and Karl write about, someday. For the time being, we only need to know it "broke"...

RowNum SomeInt Expected
   1      3       3
   2      4       6
   3      5       9
   4      6       12

Figure 26 - Bad results from 3 part update

So what do we do in such a case? Well, no one said that you HAD to use the 3 part SET statement. It's just one option that happens to work well for running totals and running counts. The "Quirky Update" will work just fine with all 2 part SET statements. All you have to do is get the formula working correctly the first time and then it will work correctly all the time. If we take the same problem that we couldn't get working correctly with the 3 part SET statement and convert it all to 2 part statements, it works as expected all the time as in the following code example...

--===== Create a 4 row test table variable and preset all the rows to zero.
     -- It also creates an "Expected" column which should be the value of
     -- the SomeInt column (starts a 3, counts by 3) when we're done.
DECLARE @OMG TABLE 
       (RowNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, SomeInt INT, Expected INT)
 INSERT INTO @OMG (SomeInt, Expected) 
 SELECT 0,3 UNION ALL SELECT 0,6 UNION ALL SELECT 0,9 UNION ALL SELECT 0,12
--===== Create a variable to hold a number 
     -- and preset it to some known value
DECLARE @N INT, @Anchor INT
 SELECT @N = 0
 
--===== Do a "Quirky Update" with only 2 part SET statements
     -- to overcome the problem.
 UPDATE @OMG
    SET @N      = @N + 1, --Adds 1 to N (works every time)
        @N      = @N + 2, --Adds another 2 to N (works every time)
        SomeInt = @N,     --Updates SomeInt with N
        @Anchor = RowNum
   FROM @OMG --WITH (TABLOCKX) --Can't be used on a table variable
 OPTION (MAXDOP 1)
--===== Display the result. Should start at 3 and count by 3's.
     -- This time it worked.
 SELECT * FROM @OMG
GO

Figure 27 - 2 Part Set Statements Always Work as Expected

As you can see in Figure 27 above, we've split the 3 part SET statement into two separate components. If you run that code, you'll get the correct results every time. Of course, at this point, we all realize two things... 1) If this is done in SQL Server 2005 and above, the use of ROW_NUMBER() may be the better option (might not even need an UPDATE) and 2) if this is done in SQL Server 2000, one of the correct ways to write the code would be as follows...

--===== Create a 4 row test table variable and preset all the rows to zero.
     -- It also creates an "Expected" column which should be the value of
     -- the SomeInt column (starts a 3, counts by 3) when we're done.
DECLARE @OMG TABLE 
       (RowNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, SomeInt INT, Expected INT)
 INSERT INTO @OMG (SomeInt, Expected) 
 SELECT 0,3 UNION ALL SELECT 0,6 UNION ALL SELECT 0,9 UNION ALL SELECT 0,12
--===== Create a variable to hold a number 
     -- and preset it to some known value
DECLARE @N INT, @Anchor INT
 SELECT @N = 0
 
--===== Do a "Quirky Update" with the 3 part SET statement done
     -- the right way.
 UPDATE @OMG
    SET @N = SomeInt = @N + 3, --Adds 3 to N and updates SomeInt with N
        @Anchor = RowNum
   FROM @OMG --WITH (TABLOCKX) --Can't be used on a table variable
 OPTION (MAXDOP 1)
--===== Display the result. Should start at 3 and count by 3's.
     -- This time it worked.
 SELECT * FROM @OMG
GO

Figure 28 - A correct way to count by 3

You could also write it as 2 part SET statements, as well.

The RULES

Here's a complete list of the rules for using the quirky update. These are some very strict but simple and easy to remember rules that you have to follow in order to guarantee the success of such code...

1. CLUSTERED INDEX MUST BE PRESENT IN THE CORRECT ORDER: You must have a clustered index on the table that is in the correct order to support the updates in the required procedural order. It does NOT have to be the primary key but it MUST be a clustered index. Do not attempt to do a "Quirky Update" for running totals against a heap table or a non-clustered index because the order of the update is simply not guaranteed on those structures. Although it may look like it works here and there, it WILL eventually fail because of the "Merry-go-Round" effect that some non-clustered indexes have. Further, do NOT attempt to use an ORDER BY to try to force an order contrary to the clustered index. In fact, if you try a "Quirky Update" using an ORDER BY and the table has a clustered index, the ORDER BY will actually be ignored and the update will occur in clustered index order.

2. PARALLELISM MUST BE PREVENTED: You MUST prevent parallelism from occurring. Therefore, you MUST include OPTION (MAXDOP 1) in any such code.

3. DON'T WORK AGAINST PARTITIONED STRUCTURES: There are a number of huge problems associated with attempting such running total code on partitioned tables and views. Rather, use SELECT/INTO to quickly copy the data to a Temp Table, add the necessary clustered index to the Temp Table, and do the running total queries on the Temp Table. If you need to update the original partitioned structure with the results of that, be sure to include something absolutely unique in the Temp Table so you can correctly join to the original table for the final update. That's normally pretty easy to do even if it's not a part of the clustered index. All of that will still take a lot less time than even using a Cursor.

4. USE THE TABLOCKX HINT: If you're updating a local Temp Table, only the current session has access to that Temp Table which means that the TABLOCKX hint really isn't required. BUT, it's absolutely necessary if you're updating a permanent table that others have access to because any interceding insert, update, or delete would destroy the order of the update (even if it was just a Cursor update!). For that reason, even if the entire table isn't being updated, you MUST use the TABLOCKX hint. It's just easier to remember to always use the TABLOCKX hint.

5. DO NOT USE JOINS: I didn't mention it before, but there should be no joins in the update. Even with the INDEX(0) hint, you're taking a chance on something going wrong because of the join. Only do running total updates on a single table with no joins. If joins are absolutely necessary to get the correct data for the running total, assemble that data in a Temp Table and do the running total on that single Temp Table. Then, if necessary, copy the running totals back to the original tables.

6. YOU MUST HAVE AN "ANCHOR" COLUMN: ALWAYS include an "anchor" column using just a "2 part" update (ie, @variable = key column) even if it's not used in the calculation. The anchor must be a column in the clustered index. My recommendation would be to try to use the first column of the clustered index as the anchor. The use of an anchor column isn't normally necessary but there are some unpredictable cases where an error in the running total can occur without it. Since it provides virtually no impact on performance, it's a bit insane to not include it.

7. DO NOT USE ORDER BY: ORDER BY does not work reliably with the "Quirky Update". If there's a clustered index on the table, the update will be done in the order of the clustered index even if it's wrong. If the correct Clustered Index is not available on a given table and cannot be made available, copy the required data to a Temp Table, add the correct clustered index, and do the running totals on that Temp Table. You'll still get the job done 9 times faster than using a cursor.

8. DO NOT USE INDEX HINTS TO TRY TO FORCE ORDER: Just like when trying to use ORDER BY, if a Clustered Index is available on the table, the "Quirky Update" will still occur in Clustered Index order (which is why the old INDEX(0) hint is NOT required). If the correct Clustered Index is not available on a given table and cannot be made available, copy the required data to a Temp Table, add the correct clustered index, and do the running totals on that Temp Table. You'll still get the job done 9 times faster than using a cursor.

9. GET IT RIGHT: Depending on what you're doing, the 3 part SET statement may simply not work correctly. If this happens, don't despair. Just change the 3 part formula to two 2 part formulas and try again. Once you "get it right" (whether you've used 3 part, mixed, or just 2 part SET statements), it won't change over time or when the data changes or when you move the code to production.

10. TEST! I can't stress the importance of this enough even in "normal" SQL. If you implement untested code, you're absolutely begging for a major change in your career. TEST!

Because of the discussion I've added to each of the rules above, the rules look to be very complex. If you look at just the bold leader for each rule, they're actually pretty simple in code.

Go back and look at the previous code in Figure 19. Look how simple it is and how easy it is to follow the rules. If you really want to convince yourself how easy this really is, go back and look at the original cursor code in Figure 6 for just one column. 😉 Both use the same methodology but the Cursor code is a lot more complicated not to mention a lot slower, as well.

Conclusion:

When it comes to running totals, the "Quirky Update" blows the doors off of all other T-SQL methods. In SQL Server 2000, the "Quirky Update" is the only high speed method to do the equivalent of "partitioned" ROW_NUMBERs and RANK without the slothfulness of some explicit RBAR loop or a "Triangular Join". The "Quirky Update" does in about 6 seconds that which takes a cursor almost 8 minutes to accomplish. To put things into perspective, that means the "Quirky Update" is about 80 times faster than a cursor when done "in place" in the same table.

I've not used the "Quirky Update" in SQL Server 2008 because I simply haven't used 2008, yet. That notwithstanding, I have it from many reliable sources, including many on this forum, that the "Quirky Update" works in 2008 just like it works in 2005. Of course, I'll test the heck out of it myself when I finally do install 2008 on my desktop.

Admittedly, the "Quirky Update" (update variables and columns at the same time) is a highly controversial method. Many folks simply don't trust it and about as many simply condemn it as a "hack that uses undocumented features". However, without exception and to date, if they follow the rules, none of them have been able to break it... once the necessary formulas have been made to work correctly, it always works. You may be among those that don't trust it and I certainly understand the healthy skepticism.

With skepticism in mind, you do have choices... you can always use a RBAR solution to solve problems like this and put up with the resulting performance problems and some rather lengthy code. To be fair, though, the operation of Cursors is fully documented and "guaranteed" to be accurate provided you write the Cursor code correctly (hmmm... sounds familiar ;-)). If you're using SQL Server 2005, you could always write a CLR.

Although not demonstrated in this article, it was mentioned that could also write some code that updates running balance columns as each row is inserted using triggers and the like. Of course, that's also a form of "RBAR times two" but it may be very appropriate for the task at hand. It may also NOT lend itself well to batch inserts.

Another choice is to use both the "Quirky Update" and the code that verifies it. Even if you copy the required data to a Temp Table (20 secs), do the update (6 secs), use the verification code (10 secs), and then update the original table (14 secs), it will still be almost 10 times faster than RBAR. On my machine, the Cursor method took almost 8 minutes to do just 1 running total. Running the full gambit of "Quirky Update" and "Verification" code including the creation of a separate verification table (which I'd only do for "in-place" updates) only took 50 seconds. Remember... that's on a million rows on a 7 year old non-server quality box.

Of course, if you do grow to trust the "Quirky Update" as I have over time, you can convert a million row, 8 minute long, resource-sucking RBAR run to a 6 second blip on the resources that no one has been able to break yet. 😉

In the Works...

In Microsoft Word, this article is about 22 pages long. There just wasn't room to put all of the possible uses for the "Quirky Update" in a single article. I'm in the process of writing another article on the other uses of the "Quirky Update" called "Solving Previous Row & Strange Sequencing Problems in T-SQL". We'll also cover a rather brilliant "non-Quirky Update" method that I saw in an SQL Mag contest to do some really odd groupings in SQL Server 2005 and up using ROW_NUMBER.

Thanks for listening, folks.

--Jeff Moden

"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row"

Solving the Running Total and Ordinal Rank Problems (Rewritten)

* NOTICE * 24 Feb 2011 - This article is in the process of (hopefully) the final "rewrite". * NOTICE *

For those that remember the first article:

Introduction:

Warning:

What is a "Running Total"?

Test Table and Data

Create the Test Table and Indexes

Populate the Test Table with Randomized Data

Checking the Fragmentation and Page Split Levels

Planning the Running Total

Using a Cursor to Solve for Running Totals

Verifying the Results

Resetting the Test Table

The (ugh!) Triangular Join Running Total Method

Introduction to Pseudo-Cursors

Introducing the "Quirky Update"

No Need for Index Hints

Table Modification by Others Must Be Prevented

The "Quirky Update" Running Total

ORDER BY and Extra "Warm Fuzzies"

When a Clustered Index is Present

When a Clustered Index is NOT Present

Partitioned Table and View Problems

Problems with Parallelism and the Fix

OMG! I BROKE IT!

The RULES

Conclusion:

In the Works...

Resources

Rate

Share

Categories

Share

Rate

Solving the Running Total and Ordinal Rank Problems (Rewritten)

*** NOTICE *** 24 Feb 2011 - This article is in the process of (hopefully) the final "rewrite". *** NOTICE ***

For those that remember the first article:

Introduction:

Warning:

What is a "Running Total"?

Test Table and Data

Create the Test Table and Indexes

Populate the Test Table with Randomized Data

Checking the Fragmentation and Page Split Levels

Planning the Running Total

Using a Cursor to Solve for Running Totals

Verifying the Results

Resetting the Test Table

The (ugh!) Triangular Join Running Total Method

Introduction to Pseudo-Cursors

Introducing the "Quirky Update"

No Need for Index Hints

Table Modification by Others Must Be Prevented

The "Quirky Update" Running Total

ORDER BY and Extra "Warm Fuzzies"

When a Clustered Index is Present

When a Clustered Index is NOT Present

Partitioned Table and View Problems

Problems with Parallelism and the Fix

OMG! I BROKE IT!

The RULES

Conclusion:

In the Works...

Resources

Rate

Share

Categories

Share

Rate

Related content

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns

SQL Server Rounding Tips and Tricks

Find a String in a String in SQL Server

A simple trick for “Block Comment” syntax in SSMS

T-SQL: Why “It Depends”

* NOTICE * 24 Feb 2011 - This article is in the process of (hopefully) the final "rewrite". * NOTICE *