• Just following up... here's a cool problem that I just ran across that I was able to answer because I'd done something like it before...

    Here's the problem...

    You have a payment table that accumulates payment records for a week at a time... the agreement you have with the bank clearing house is that you can send as many files as you want with as many records as you want but you can only send the files once a week and no file shall have any more than 1,000,000 dollars worth payments in it.  Because business has been good, this week's payment table has 200,000 records in it ranging from 1 dollar to 100 dollars.  Determine which rows you will send as the first file by PK (an INT IDENTITY column for this example).  Additionaly, you must provide the PK, the payment amount, and a running total in sorted order because the bank wants it that way in order for your company to be able to actually get paid.

    There's only two restrictions.. you cannot use an external app because there isn't one and the company doesn't want to spend the resources to build/test one (YOU will get fired if you do make one even on your own time because your boss is a jerk).  The other is that you cannot mod the payment table because the 3rd party vendor that wrote it says it will break their app if you add a column to it because they used some type of schema binding in their compiled code to prevent tampering with the tables they made.

    I'll even provide the test data...

    --============================================================================

    -- Create a test table

    --============================================================================

    --===== If the test exists, drop it so we can rerun if we need to

         IF OBJECT_ID('dbo.jbmRunTotTest') IS NOT NULL

            DROP TABLE dbo.jbmRunTotTest

    --===== Suppress auto-display of rowcounts for appearance and speed

        SET NOCOUNT ON

    --===== Create and populate the jbmRunTotTest table on the fly

     SELECT TOP 200000 

            IDENTITY(INT,1,5) AS PK,

            CAST(RAND(CAST(NEWID() AS VARBINARY))*99+1 AS DECIMAL(7,2)) AS ColA

       INTO dbo.jbmRunTotTest

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

      ALTER TABLE dbo.jbmRunTotTest

        ADD CONSTRAINT PK_jbmRunTotTest_PK PRIMARY KEY CLUSTERED (PK)

    --===== Allow the general public to use it

      GRANT ALL ON dbo.jbmRunTotTest TO PUBLIC

    So, how to solve this one?  A self-joined correlated sub-query would take about a month (took 29 seconds on only 9,999 records and get's exponentially worse).  Hmmmm.... maybe we'll have to resort to a RBAR WHILE loop or even a Cursor... nah...

    Again, we just want to return one batch of ID's that contain no more than 1,000,000 dollars worth of payments (we'll worry about the multiple batches some other time)... here's a nasty fast way to do it with a temp table and a little help from SQL Server's proprietary UPDATE statement...

    --============================================================================

    -- Return all PK's (other cols returned for bank proof) that have a running

    -- total less than some predetermined number (1 batch)

    --============================================================================

    --===== If the working temp table already exists, drop it for reruns

         IF OBJECT_ID('TempDB..#Temp') IS NOT NULL

            DROP TABLE #Temp

    --===== Clear any cached data so we can get a true measure of how fast this is

       DBCC DROPCLEANBUFFERS

       DBCC FREEPROCCACHE

    --===== Declare local variables

    DECLARE @StartTime DATETIME --< Just to measure run duration

        SET @StartTime = GETDATE()

    DECLARE @MaxVal INT

        SET @MaxVal = 1000000 --< This is the predetermined max running total

    DECLARE @RunTot DECIMAL(38,2)   --< This let's us calc the running total

        SET @RunTot = 0

    --===== Create and populate the temporary working table on the fly

     SELECT IDENTITY(INT,1,1) AS RowNum,

            CAST(PK AS INT) AS PK, --< The Primary Key Col from your table

            ColA, --< The column that contains the value from your table

            CAST(0 AS DECIMAL(38,2)) AS RunTotal --< To be calculated

       INTO #Temp --< The new temporary working table

       FROM jbmRunTotTest t1 --< Our test table (sub your table name here)

    --===== Add a new clustered Primary Key because every table should have one

         -- because UPDATE doesn't like an ORDER BY

      ALTER TABLE #Temp

        ADD PRIMARY KEY CLUSTERED (RowNum)

    --===== Update the temporary working table with the running total in RowNum order

     UPDATE t1

        SET @RunTot = RunTotal = @RunTot+t1.ColA

       FROM #Temp t1

    --===== Select a batch of rows where the running total is less than the predetermined max

     SELECT *

       FROM #Temp

      WHERE RunTotal <= @MaxVal

    --===== All done, display the duration

      PRINT CONVERT(VARCHAR(12),DATEADD(ms,DATEDIFF(ms,@StartTime,GETDATE()),0),114)

          + ' Run Duration (hh:mm:ss:mil)'

    Is it bad design where I had to do a work around?  I don't think so... it's just everyday normal stuff.  The use of the temp table allows this process to find the first batch of ID's and the other required info in only 3.36 seconds on a single CPU at 1.8Mhz with 1 IDE hard drive and only 2 gig of memory.  Now I don't know about you but I'm thinking that, because of the restrictions, a temp table was the only way to go.  And, I'll go a little further and say that I think it's going to be tough for anyone to beat the performance the temp table allowed the code to run with.

    But that could be just a temporary thing

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)