December 30, 2009 at 12:26 am
Thanks everyone for your help regarding the dynamic crosstab procedure. I was successful in modifying the section SQL3 to provide cumulative values by month, using what Jeff provided as the starting point.
Attached is the final procedure incorporating the changes to support cumulative values by month.
December 30, 2009 at 7:44 am
tm3 (12/30/2009)
Thanks everyone for your help regarding the dynamic crosstab procedure. I was successful in modifying the section SQL3 to provide cumulative values by month using what Jeff provided as the starting point. Attached is the final procedure incorporating the changes to support cumulative values by month.
Ummmm.... yes... that will work but you need to be made aware of something. The following code...
WHERE (DATEADD(mm, DATEDIFF(mm, 0, Date), 0)
<= DATEADD(mm, DATEDIFF(mm, 0, a.Date), 0)) AND (Account
... creates what is known as a "Triangular Join". Please see the following article as to what a server crushing effect they can have...
http://www.sqlservercentral.com/articles/T-SQL/61539/
I'll also cite a common mistake people (even some of the "old timers") make concerning the subject of "Triangular Joins" ... they'll say that "there won't ever be more than just a couple of hundred rows." Maybe, maybe not. You can't predict what the future will bring and you certainly can't predict when someone under the gun to get something developed won't copy your code for a much larger application with a whole lot more rows.
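For anyone who hasn't run across the term before, here's a minimal sketch of the shape such a join takes (the #MonthTotals table and its columns are hypothetical, just for illustration). Because of the inequality in the join, every row joins to every row at or before its own date, so the internal row count grows roughly as N*(N+1)/2 instead of N.
--===== Sketch only... the inequality in the join predicate is what makes it "triangular"
SELECT a.Account,
       a.MonthDate,
       RunningTotal = SUM(b.Total)        --each row re-reads all of the earlier rows
  FROM #MonthTotals a
 INNER JOIN #MonthTotals b
         ON b.Account    = a.Account
        AND b.MonthDate <= a.MonthDate    --for N months, roughly N*(N+1)/2 rows are touched
 GROUP BY a.Account, a.MonthDate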
--Jeff Moden
Change is inevitable... Change for the better is not.
December 30, 2009 at 8:04 am
tm3 (12/29/2009)
Jeff, much appreciated if you can help me achieve the desired result of a cumulative total by month for the networth table I require. Hopefully others requiring cumulative totals will benefit from your/our work on this procedure. I'll wait for your next reply before I make any further attempts to solve the problem. The interaction between SQL2 and SQL3 is a bit more complex than I bargained for at this point in my SQL coding development.
Todd
tm3, as long as you don't know that something you are trying is beyond your ability, you will never stop trying. When you stop trying, it just became beyond your ability. Just wait till you see what happens when your variable crosstab meets quirky update.
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
December 30, 2009 at 8:09 am
Heh... to wit... the bumble bee doesn't know that its wings are too small and its body is too heavy to fly.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 30, 2009 at 9:13 am
Jeff, if you have a better approach, I am all ears at this point. I have been struggling with this procedure for about 4 days now. I also learned last night after proudly publishing the successful build that it has a flaw -- displays a zero in subsequent months that have no transactions. Here is an example:
ACCOUNT............DATE..............TOTAL
------------- ------------- -----------------
ACCT1.........10/01/09.............$100.00
ACCT1.........11/01/09.............$ 40.00
ACCT1.........01/01/10.............$ 50.00
... the output table from the latest procedure I published looks like this.
ACCOUNT........ OCT 2009 ......... NOV 2009 .... DEC 2009 .... JAN 2010
------------- ------------- --------------- ------------ -------------
ACCT1............. $100.00 .......... $140.00.........$0.00 .......... $190.00
... I would like it to display the following (even when there are no records for DEC 2009):
ACCOUNT........ OCT 2009 ......... NOV 2009 .... DEC 2009.... JAN 2010
------------- ------------- --------------- ------------ ------------
ACCT1............. $100.00 .......... $140.00......... $140.00 ...... $190.00
Any ideas how to solve the triangular join issue and "$0.00" value being displayed in periods without any transactions?
I could solve the "$0.00" value issue by looping through the records output from the procedure in my VB.net code but it would be great if the procedure did the heavy lifting.
Todd
December 30, 2009 at 9:17 am
tm3 (12/30/2009)
Jeff, if you have a better approach, I am all ears at this point. I have been struggling with this procedure for about 4 days now. I also learned last night after proudly publishing the successful build that it has a flaw -- displays a zero in subsequent months that have no transactions. Here is an example:ACCOUNT............DATE..............TOTAL
------------- ------------- -----------------
ACCT1.........10/01/09.............$100.00
ACCT1.........11/01/09.............$ 40.00
ACCT1.........01/01/10.............$ 50.00
... the output table from the latest procedure I published looks like this.
ACCOUNT........ OCT 2009 ......... NOV 2009 .... DEC 2009 .... JAN 2010
------------- ------------- --------------- ------------ -------------
ACCT1............. $100.00 .......... $140.00.........$0.00 .......... $190.00
... I would like it to display the following (even when there are no records for DEC 2009):
ACCOUNT........ OCT 2009 ......... NOV 2009 .... DEC 2009.... JAN 2010
------------- ------------- --------------- ------------ ------------
ACCT1............. $100.00 .......... $140.00......... $140.00 ...... $190.00
Any ideas how to solve the triangular join issue and "$0.00" value being displayed in periods without any transactions?
I could solve the "$0.00" value issue by looping through the records output from the procedure in my VB.net code but it would be great if the procedure did the heavy lifting.
Todd
Don't loop and don't divert to VB.net code... it's not necessary. I can easily fix this, but can't do it right now... I'm at work. Will tackle this tonight. In the meantime, read the article I posted about "Running Totals" so you can understand what it is that will be done.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 30, 2009 at 11:01 am
tm3 (12/30/2009)
Jeff, if you have a better approach, I am all ears at this point. I have been struggling with this procedure for about 4 days now. I also learned last night after proudly publishing the successful build that it has a flaw -- displays a zero in subsequent months that have no transactions. Here is an example:ACCOUNT............DATE..............TOTAL
------------- ------------- -----------------
ACCT1.........10/01/09.............$100.00
ACCT1.........11/01/09.............$ 40.00
ACCT1.........01/01/10.............$ 50.00
... the output table from the latest procedure I published looks like this.
ACCOUNT........ OCT 2009 ......... NOV 2009 .... DEC 2009 .... JAN 2010
------------- ------------- --------------- ------------ -------------
ACCT1............. $100.00 .......... $140.00.........$0.00 .......... $190.00
... I would like it to display the following (even when there are no records for DEC 2009):
ACCOUNT........ OCT 2009 ......... NOV 2009 .... DEC 2009.... JAN 2010
------------- ------------- --------------- ------------ ------------
ACCT1............. $100.00 .......... $140.00......... $140.00 ...... $190.00
Any ideas how to solve the triangular join issue and "$0.00" value being displayed in periods without any transactions?
I could solve the "$0.00" value issue by looping through the records output from the procedure in my VB.net code but it would be great if the procedure did the heavy lifting.
Todd
Sorry for popping in and running, but I don't have time to supply you a code solution quite yet either.
Your basic answer (to fix the zeroes in certain months) is to ensure you have at least a $0 transaction in each month you're reporting on before you create the running total. If those exist when you fire off Jeff's running total logic, then the running total will include them in the calculation and carry over the previous value.
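Just to make the idea concrete, here's a rough sketch of how those placeholder rows could be generated. The table name dbo.tblNetworth and the @StartDate/@EndDate variables are assumptions borrowed from elsewhere in this thread, and Jeff's solution further down takes the safer route of building the account/month grid in a work table instead of inserting rows into the source table.
--===== Sketch only: seed a $0 row for every account/month in the report range
--      that has no transactions (assumes the dbo.Tally table used in this thread)
INSERT INTO dbo.tblNetworth (Account, Date, Total)
SELECT a.Account,
       DATEADD(mm, DATEDIFF(mm,0,@StartDate) + t.N - 1, 0),
       0
  FROM (SELECT DISTINCT Account FROM dbo.tblNetworth) a
 CROSS JOIN dbo.Tally t
 WHERE t.N <= DATEDIFF(mm,@StartDate,@EndDate) + 1
   AND NOT EXISTS (SELECT 1
                     FROM dbo.tblNetworth n
                    WHERE n.Account = a.Account
                      AND n.Date >= DATEADD(mm, DATEDIFF(mm,0,@StartDate) + t.N - 1, 0)
                      AND n.Date <  DATEADD(mm, DATEDIFF(mm,0,@StartDate) + t.N,     0))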
I'm actually fairly jammed up for the rest of the day, but I may take a look later tonight if Jeff doesn't beat me to it.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
December 30, 2009 at 10:00 pm
Jeff:
I just read the article you referred to (http://www.sqlservercentral.com/articles/T-SQL/68467/) and I now have an appreciation for your plan to fix the running-total-by-month procedure. You actually update the record with the running total so that the records can be accumulated quickly and accurately, thus avoiding the triangular join issue.
I look forward to receiving your updates.
Todd
December 31, 2009 at 1:48 pm
Sorry... made a "performance" error and I took the code that was here down... I'll be back in a few.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 31, 2009 at 3:10 pm
All right... here's the test table generator. It gen's a million rows for 100 accounts across 10 years. Read the comments about some of the indexing applied. I also deleted a whole month's worth of rows to test for your "missing dates" problem.
--=====================================================================================================================
-- Create a test table and populate it.
-- Read the warning below... I'm not responsible for your data if you wreck it.
-- I tried to make everything as safe as possible. If you change it, you could overwrite real data.
-- This section is not a part of the solution. It's just to build a test table in TempDB.
-- Take note of some of the indexes built for performance.
--=====================================================================================================================
--===== Do this test in a nice safe place.
-- Warning... if you leave this out, it may/will damage any real tables.
USE TempDB
--===== Conditionally drop the test table
IF OBJECT_ID('TempDB.dbo.tblTestNetworth','U') IS NOT NULL
DROP TABLE TempDB.dbo.tblTestNetworth
--===== Create the test table
CREATE TABLE TempDB.dbo.tblTestNetworth
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
Account NCHAR(50), --Bad design here. SHOULD BE NVARCHAR!!!
Date DATETIME,
Total MONEY
)
--===== Specify the test data constraints
DECLARE @StartDate DATETIME,
@EndDate DATETIME,
@NumOfAccounts INT,
@MaxAmount INT
SELECT @StartDate = 'Jan 2000', --Inclusive
@EndDate = 'Jan 2010', --Not inclusive
@NumOfAccounts = 100, --Max is 9999
@MaxAmount = 100
--===== This populates the test table with a whole bunch of highly randomized but constrained test data.
-- Don't let the TOP number scare you... this doesn't take long.
INSERT INTO TempDB.dbo.tblTestNetworth
SELECT TOP 1000000
Account = 'Acct'+RIGHT('0000'+CAST(ABS(CHECKSUM(NEWID()))%@NumOfAccounts+1 AS VARCHAR(4)),4),
Date = RAND(CHECKSUM(NEWID()))*DATEDIFF(dd,@StartDate,@EndDate)+CAST(@StartDate AS DATETIME),
Total = ROUND(RAND(CHECKSUM(NEWID()))*@MaxAmount,2)
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== We'll add an index for some wicked speed
CREATE NONCLUSTERED INDEX IX_tblTestNetworth_ReportComposite01 ON dbo.tblTestNetworth
(Account ASC, Date ASC, Total)
--===== We also need a Tally table for this project
IF OBJECT_ID('TempDB.dbo.Tally','U') IS NULL
BEGIN
--===== Create and populate the Tally table on the fly
SELECT TOP 1000000
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1
CROSS JOIN Master.dbo.SysColumns sc2
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
--===== Allow the general public to use it
GRANT SELECT ON dbo.Tally TO PUBLIC
END
--===== This is just to test for a "missing month"
DELETE FROM dbo.tblTestNetworth WHERE Date >='20080301' AND Date < '20080401'
GO
Here's a much more optimized solution than what I posted before. It's totally "self healing" so that if new accounts are added, they're automatically added to the output. Again, read the comments. All of the pre-aggregation was done outside of the dynamic SQL this time simply because it made the various ("Divide'n'Conquer") parts a bit easier to write. On my very humble machine, it returns a year's report in about 5 seconds. That includes ALL aggregations and the final dynamic cross tab. After stats and cache pick up on things, additional year-long reports come in at about 2 seconds. On a "real" server, those times should be greatly reduced unless a wad of other code is eating up resources.
/**********************************************************************************************************************
Solve the problem... this is where the rubber meets the road.
You get to turn it into a stored procedure.
**********************************************************************************************************************/
--===== These will become parameters in the final code
DECLARE @StartDate DATETIME
DECLARE @EndDate DATETIME
SET @StartDate = 'Dec 2008'
SET @EndDate = 'Jan 2008'
--=====================================================================================================================
-- Presets
--=====================================================================================================================
--===== Suppress the auto-display of row counts to prevent false error reporting
SET NOCOUNT ON
--===== This variable gives us "room" to swap dates if needed
DECLARE @SwapDate DATETIME
--===== Declare the variables that will contain the dynamic SQL.
-- I decided to use the VARCHAR version (no QUOTENAME allowed, though)
DECLARE @SQL1 VARCHAR(8000),
@SQL2 VARCHAR(8000),
@SQL3 VARCHAR(8000)
--===== If the dates are not in the correct order, swap them...
-- WITHOUT using an IF
SELECT @SwapDate = @EndDate,
@EndDate = @StartDate,
@StartDate = @SwapDate
WHERE @EndDate < @StartDate
--===== "Normalize" the start and end dates to reflect full month ranges
SELECT @StartDate = DATEADD(mm,DATEDIFF(mm,0,@StartDate),0),
@EndDate = DATEADD(mm,DATEDIFF(mm,0,@EndDate)+1,0) --Dates less than this date will be used
--=====================================================================================================================
-- Pre-aggregate the data in a temp table so we can do the running totals.
-- This is accomplished very quickly because we don't actually care about displaying the data at this point.
-- "Divide'n'Conquer" is one of the most importance aspects of achieving high performance code.
--=====================================================================================================================
--===== Conditionally drop the working table
IF OBJECT_ID('TempDB..#Work','U') IS NOT NULL
DROP TABLE #Work
--===== Copy only the data necessary from the original table to the working table.
-- A cross tab creates all possible combinations of accounts and monthly date
-- ranges. Also note how the new columns for totals are formed during the process.
SELECT ctrl.Account,
ctrl.MonthStart,
ctrl.NextMonthStart,
ctrl.MonthName,
CAST(0 AS MONEY) AS Total,
CAST(0 AS MONEY) AS RunningTotal
INTO #Work
FROM ( --=== Create all accounts and dates whether missing data or not using
-- a cross join between the two derived tables below.
SELECT accts.Account,
dates.MonthStart,
dates.NextMonthStart,
STUFF(CONVERT(CHAR(11),dates.MonthStart,100),4,3,'') AS MonthName
FROM ( --=== Find all accounts in the table so we can report "0" activity
SELECT DISTINCT Account --This takes a lot of time for what it does
FROM dbo.tblTestNetworth
WHERE Account > ''
)accts
CROSS JOIN
( --=== Find all the months in the desired range whether they have data or not
SELECT DATEADD(mm,DATEDIFF(mm,0,@StartDate)+(t.N-1),0) AS MonthStart,
DATEADD(mm,DATEDIFF(mm,0,@StartDate)+(t.N ),0) AS NextMonthStart
FROM dbo.Tally t
WHERE t.N <= DATEDIFF(mm,@StartDate,@EndDate)+1
)dates
)ctrl
--===== Create the very necessary clustered index to control the order of UPDATE for the "Quirky Update" later on.
CREATE CLUSTERED INDEX IX_#Work_Account_MonthDate
ON #Work (Account,MonthStart,NextMonthStart)
--===== Now that we have the "baseline" of what we want to actually aggregate,
-- do the aggregation
UPDATE w
SET Total = ( --=== Correlated subquery uses the available indexes very effectively
-- for this summary report
SELECT SUM(ISNULL(worth.Total,0))
FROM dbo.tblTestNetworth worth
WHERE worth.Account = w.Account
AND worth.Date >= w.MonthStart
AND worth.Date < w.NextMonthStart
)
FROM #Work w
--=====================================================================================================================
-- Calculate the running totals by Account and MonthDate using the very high speed "Quirky Update".
--=====================================================================================================================
--===== Declare the necessary local variables
DECLARE @PrevAcct VARCHAR(50),
@RunTotal MONEY
--===== Do the "Quirky Update" which works just like such a running total update in any other language.
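--      NOTE: The technique depends on the clustered index created above to control the
--      order of the update, the WITH (TABLOCKX) hint to keep other sessions out while
--      it runs, and OPTION (MAXDOP 1) to keep a parallel plan from changing that order.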
UPDATE #Work
SET @RunTotal = RunningTotal = CASE
WHEN Account = @PrevAcct
THEN ISNULL(Total,0) + @RunTotal
ELSE ISNULL(Total,0)
END,
@PrevAcct = Account
FROM #Work WITH (TABLOCKX)
OPTION (MAXDOP 1)
--=====================================================================================================================
-- Finally, do a dynamic Cross-Tab to display the data.
--=====================================================================================================================
--===== Create the "static" section of the code
SELECT @SQL1 = 'SELECT CASE WHEN GROUPING(Account)=1 THEN ''Total'' ELSE Account END AS Account,'+CHAR(10)
--===== The "Hard" Part - Concatenation to Form the Date Columns
SELECT @SQL2 = COALESCE(@SQL2+',' + CHAR(10),'')
+ 'SUM(CASE WHEN MonthName=' + QUOTENAME(d.MonthName,'''')
+ ' THEN RunningTotal ELSE 0 END)[' + d.MonthName + ']'
FROM ( --=== Produces the list of MonthNames in the mmm yyyy format
SELECT N,
STUFF(CONVERT(CHAR(11),DATEADD(mm, N-1, @StartDate),100),4,3,'') AS MonthName
FROM dbo.Tally
WHERE N <= DATEDIFF(mm,@StartDate,@EndDate)
) d
ORDER BY d.N
--===== Create the "Mostly Static" section of the code
SELECT @SQL3 = CHAR(10) + 'FROM #Work GROUP BY Account WITH ROLLUP'
--===== Print the SQL we just built in the messages tab just in case
-- we need to troubleshoot it
PRINT @SQL1+@SQL2+@SQL3
--===== And now we execute the SQL
EXEC (@SQL1+@SQL2+@SQL3)
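For reference, with a hypothetical three-month window starting at Jan 2008, the PRINT above would emit a statement shaped like this:
SELECT CASE WHEN GROUPING(Account)=1 THEN 'Total' ELSE Account END AS Account,
SUM(CASE WHEN MonthName='Jan 2008' THEN RunningTotal ELSE 0 END)[Jan 2008],
SUM(CASE WHEN MonthName='Feb 2008' THEN RunningTotal ELSE 0 END)[Feb 2008],
SUM(CASE WHEN MonthName='Mar 2008' THEN RunningTotal ELSE 0 END)[Mar 2008]
FROM #Work GROUP BY Account WITH ROLLUP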
--Jeff Moden
Change is inevitable... Change for the better is not.
December 31, 2009 at 6:17 pm
Very nice!!! It worked perfectly and I really appreciate the assistance, Jeff.
I would never have figured this out on my own and would probably have tried manipulating the records in my VB.net code which, as you said, would have been very inefficient.
Happy new year!
Todd
December 31, 2009 at 7:44 pm
tm3 (12/31/2009)
Very nice!!! It worked perfectly and I really appreciate the assistance Jeff.I would have never figured this out on my own and would have probably tried manipulating the records using VB.net which as you said would have been very inefficient.
Happy new year!
Todd
Happy New Year, Todd. Thanks for the feedback.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 31, 2009 at 8:04 pm
For those interested in what has been done on this thread, several "technologies" have been brought together to make this all happen. If you'd like to read about them, here's a full list of articles that cover those technologies...
[font="Arial Black"]Performance Tuning: Concatenation Functions and Some Tuning Myths[/font][/url]
[font="Arial Black"]The "Numbers" or "Tally" Table: What it is and how it replaces a loop.[/font]
[/url][font="Arial Black"]Cross Tabs and Pivots, Part 1 – Converting Rows to Columns[/font][/url]
[font="Arial Black"]Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs[/font][/url]
... and here's the article on why a "Triangular Join" is a bad thing...
[font="Arial Black"]Hidden RBAR: Triangular Joins[/font][/url]
... last but not least, here's how to get better answers to your posts more quickly than you could imagine...
[font="Arial Black"]Forum Etiquette: How to post data/code on a forum to get the best help[/font][/url]
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2010 at 10:53 pm
Jeff:
After further testing, I have found the following strange situation with the procedure edits you provided:
1) If I use a date range of start date 10/03/2005 and end date 01/03/2010, the procedure builds the table as expected.
2) If I use a date range of start date 09/03/2005 and end date 01/03/2010, the procedure generates the following error:
Incorrect syntax near the keyword 'FROM'. Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
I rebuilt the tally table to make sure sufficient month placeholders were provided but the error remains. Is it possible the procedure only accepts a date range for a "fixed" number of periods and the one additional month resulting from 09/03/2005 breaks the code?
January 4, 2010 at 1:05 am
tm3 (1/3/2010)
Jeff:After further testing, I have found the following strange situation with the procedure edits you provided:
1) if I use a date range of start date: 10/03/2005 end date: 01/03/2010, the procedure builds the table as expected.
2) if I use a date range of start date: 09/03/2005 end date: 01/03/2010, the procedure generates the following error:
Incorrect syntax near the keyword 'FROM'. Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
I rebuilt the tally table to make sure sufficient month placeholders were provided but the error remains. Is it possible the procedure only accepts a date range for a "fixed" number of periods and the one additional month resulting from 09/03/2005 breaks the code?
The problem is that there's an 8000-character limit on each VARCHAR variable and @SQL2 is probably running out of room. You can get some relief by changing long names like "RunningTotal" to just "RT" or "R", but that won't last long. The best way would be to split @SQL2 into two or more pieces based on the number of months in the given date range.
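As a rough, untested sketch of that idea (variable and table names are taken from the code posted above; another option on SQL Server 2005 and later would simply be to declare the dynamic SQL variables as VARCHAR(MAX)):
--===== Sketch only: build the month columns in two pieces so neither variable
--      approaches the 8000-character VARCHAR limit
DECLARE @SQL2a VARCHAR(8000),
        @SQL2b VARCHAR(8000),
        @Months INT
SELECT  @Months = DATEDIFF(mm,@StartDate,@EndDate)
--===== First half of the month columns (no leading comma on the first item)
SELECT @SQL2a = COALESCE(@SQL2a+','+CHAR(10),'')
     + 'SUM(CASE WHEN MonthName=' + QUOTENAME(d.MonthName,'''')
     + ' THEN RunningTotal ELSE 0 END)[' + d.MonthName + ']'
  FROM (SELECT N, STUFF(CONVERT(CHAR(11),DATEADD(mm,N-1,@StartDate),100),4,3,'') AS MonthName
          FROM dbo.Tally
         WHERE N <= @Months/2) d
 ORDER BY d.N
--===== Second half (each item carries a leading comma so it splices onto the first half)
SELECT @SQL2b = COALESCE(@SQL2b,'') + ','+CHAR(10)
     + 'SUM(CASE WHEN MonthName=' + QUOTENAME(d.MonthName,'''')
     + ' THEN RunningTotal ELSE 0 END)[' + d.MonthName + ']'
  FROM (SELECT N, STUFF(CONVERT(CHAR(11),DATEADD(mm,N-1,@StartDate),100),4,3,'') AS MonthName
          FROM dbo.Tally
         WHERE N >  @Months/2
           AND N <= @Months) d
 ORDER BY d.N
--===== EXEC happily concatenates the pieces past the 8000-character mark
EXEC (@SQL1+@SQL2a+@SQL2b+@SQL3)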
--Jeff Moden
Change is inevitable... Change for the better is not.