• Luis Cazares (2/4/2014)


    ... Why wouldn't you want to use a nice pre-aggregated cross tab approach?

    SELECT

    SUM(CASE qtr WHEN 1 THEN sales END) AS q1,

    SUM(CASE qtr WHEN 2 THEN sales END) AS q2,

    SUM(CASE qtr WHEN 3 THEN sales END) AS q3,

    SUM(CASE qtr WHEN 4 THEN sales END) AS q4

    FROM (SELECT qtr, SUM(sales) sales

    FROM #p

    GROUP BY qtr)p;

    I'm sorry Luis that I missed this question... I was in a hurry to leave work last night, saw that you had replied to my post and did not finish reading it before thanking you (I knew from the first sentence you wrote what I was doing wrong). I finished your post last night.

    Thank you Jeff for jumping into the thread, I would have likely missed Luis' question otherwise. As a token of my thanks I sent you some snow.

    Ironically, a conversation started last night when I, again, asked my friend/co-worker, Rob, if he read those Jeff Moden Articles I emailed him. Blah, blah... I was showing him why I use the technique in Jeff's article (Cross Tabs and Pivots, Part 1 – Converting Rows to Columns[/url]) instead of PIVOT. I was going to walk him through a the diferent techniques (including the pre-aggregations), do some tests, look at query plans but got stuck at the Pivot. We actually had the table #p filled with 1,000,000 rows when I got stuck on PIVOT (I never use PIVOT). I did not post the complete code...

    IF OBJECT_ID('tempdb..#p') IS NOT NULL DROP TABLE #p;

    CREATE TABLE #p(id int primary key, qtr tinyint, sales int);

    DECLARE @rows int = 1000000;

    WITH iTally(n) AS

    (SELECT TOP(@rows) ROW_NUMBER() OVER (ORDER BY (SELECT ($)))

    FROM sys.all_columns a CROSS JOIN sys.all_columns b

    )

    INSERT #p

    SELECTn,

    ceiling(4*rand(convert(varbinary, newid()))),

    ceiling(80*rand(convert(varbinary, newid())))+20

    FROM iTally;

    --DBCC FREEPROCCACHE

    SET NOCOUNT ON;

    SET STATISTICS TIME ON;

    --what I need

    SELECT

    SUM(CASE qtr WHEN 1 THEN sales END) AS q1,

    SUM(CASE qtr WHEN 2 THEN sales END) AS q2,

    SUM(CASE qtr WHEN 3 THEN sales END) AS q3,

    SUM(CASE qtr WHEN 4 THEN sales END) AS q4

    FROM #p;

    --where I was stuck

    SELECT [1] AS q1, [2] AS q2, [3] AS q3, [4] AS q4

    FROM

    (SELECT qtr, SUM(sales) sales

    FROM #p

    GROUP BY qtr) AS p

    PIVOT(

    SUM(sales)

    FOR qtr IN ([1],[2],[3],[4])

    ) AS pivottbl;

    SET STATISTICS TIME OFF;

    I have to run (it has not stopped snowing and we're being kicked out). I am going to chime back in later when I'm at home. If you get a moment (I know it's not snowing where you are Luis) run this code above. I have some more questions, such as, "why no parallelism with the cross tab" ? What does the query optimized assume that a serial version of this query will be faster?

    Again, I'll chime in again later. Thanks to both of you!

    Edit: Grammar.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001