Pivot Table Help Needed

  • I also lack the oppurtunity to call the SP twice in a row.

    I get an error "index already exists".


    N 56°04'39.16"
    E 12°55'05.25"

  • Sandy (10/23/2007)


    Hey Jeff :),

    Great Standard of Code,

    I really feel its the way need to be done.

    but i have a small issue here,

    when i am trying to copy your code to my SQL editor its

    mess up all the code,

    How can i copy the your code in systematic way?

    I mean the way it present here.

    Cheers!

    Sandy.

    Not sure if it works everywhere but for Jeff's posts you can paste them into MS Word and then copy from word to QA and most of the formatting follows though.


  • Peter Larsson (10/23/2007)


    I also lack the oppurtunity to call the SP twice in a row.

    I get an error "index already exists".

    I thnk it would make sense to move the create index out of the sp and into the setup where the table is created.


  • Peter Larsson (10/23/2007)


    Jeff, excellent code but I think you should start using QUOTENAME function for those having brackets, spaces or even single quotes in their table/column names.

    Yep... that would work, too.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sandy (10/23/2007)


    Hey Jeff :),

    Great Standard of Code,

    I really feel its the way need to be done.

    but i have a small issue here,

    when i am trying to copy your code to my SQL editor its

    mess up all the code,

    How can i copy the your code in systematic way?

    I mean the way it present here.

    Sandy.

    Heh... it's this bloody new forum code they're using.

    If you copy'n'paste fromt the code window into MS Word and replace ^l (cirumflex with lower case "L") with ^p, at least the lines will end correctly instead of appearing as a single line of code. Unfortunately, leading spaces are NOT preserved either. I gotta write to Tony and let him know that's a pretty big problem for us.

    And, thank you for the compliment, Sandy.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Peter Larsson (10/23/2007)


    I also lack the oppurtunity to call the SP twice in a row.

    I get an error "index already exists".

    I'll check it out, Peter... thanks for the feedback.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • mrpolecat (10/23/2007)


    Peter Larsson (10/23/2007)


    I also lack the oppurtunity to call the SP twice in a row.

    I get an error "index already exists".

    I thnk it would make sense to move the create index out of the sp and into the setup where the table is created.

    Ah, bugger... thanks for the heads up... I'll go back and fix that, as well... I gotta stop posting stuff at 2 in the morning 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I copy your code straight from word without any replacing and it works great. I don't lose any leading spaces.


  • Just for S&G I removed your index completely and it didn't change performance time. Still 20 seconds. Here is the table I am using to test with if anybody is interested.

    SELECT TOP 1000000

    RowNum = IDENTITY(INT,1,1),

    SomeInt = CAST(RAND(CAST(NEWID() AS VARBINARY))*50000+1 AS INT),

    SomeString = CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)),

    SomeDate = dateadd(dd,convert(int,(365 * rand(CAST(NEWID() AS VARBINARY)))),'1/1/2007')

    INTO dbo.PVTest

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2 --Lack of join criteria makes this a CROSS-JOIN

    ALTER TABLE dbo.PVTest

    ADD PRIMARY KEY CLUSTERED (RowNum)


  • rog pike (10/23/2007)


    RAC will never be amongst the fastest solutions. It is a simple tradoff, performance for functionality. RAC does not build a single SELECT statement to execute against a summary table. It would be virtually impossible to design a system (in t-sql) that could do everything RAC can do in a single framework and with a single pass thru the data /summary table where the end result is encapsulated in a single SELECT. But if that 'could' be done then it would be considerably faster. As a developer you understand the delicate balance between performance, functionality and stability. And you know there are no free lunches 🙂

    best,

    No doubt... RAC does everything...

    ... slower...

    I already have enough performance impinged tools... but thanks anyway. You still haven't answered my question, though... how long does the million row example take?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • mrpolecat (10/23/2007)


    Just for S&G I removed your index completely and it didn't change performance time. Still 20 seconds. Here is the table I am using to test with if anybody is interested.

    I moved the index in my code example... thanks again for the heads up.

    I applogize for not doing the thorough "runs first time every time" testing that I normally do... it was pretty late that day. Not a good reason, though.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I think that is how it is supposed to work. You do all the hard stuff 😀 and then the rest of us come along and QA it. A few tweaks and we go to production. I'll be selling your code soon by spamming all the other threads.:w00t:


  • Sandy (10/23/2007)


    Hey Jason Tontz,

    In SQL 2005, you can use this Query.

    Hmmm... I wonder how Pivot in 2k5 would do against these simple 3 value pivots on a million rows? Too bad I don't have 2k5, yet, so I could find out.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/23/2007)


    Sandy (10/23/2007)


    Hey Jason Tontz,

    In SQL 2005, you can use this Query.

    Hmmm... I wonder how Pivot in 2k5 would do against these simple 3 value pivots on a million rows? Too bad I don't have 2k5, yet, so I could find out.

    TESTING TESTING TESTING....did someone mention testing? hehe

    So...I saw this laying out there...and I just couldn't help myself. I also had a hunch which I decided to test out.

    First things first - I ran the "little" test you asked about, and after fixing the PIVOT syntax (I wasn't getting the correct results from Sandy's syntax), and....the CASE statement won...by 178 ms. The raw results were 672ms vs 860 ms.

    Now - we've been down this path before so I KNOW that several people (you KNOW who you are) are just itching to take those numbers and run with it (PIVOT is 28% slower than CASE, etc...). But - I wasn't satisfied with the result, too easy, not enough effort for it.

    In the words of Tim "the tool man" Taylor, the test "lacked power....so I rewired it"....hehe.

    screw 3 results, let's go with 20 results. Screw 1M, go for 10M. and lest someone say it's reusing something - let's free the processor cache after each. Oh - and let's see if we can find an indexing scheme that helps.

    Here's the testing scheme I used:

    use test

    go

    --set this up

    drop table #testpivot

    go

    create table #testpivot (rid int identity(1,1) not null, coID int not

    null,prodID int not null, orderamount money not null)

    --alter table #testpivot

    --add primary key (rid) with fillfactor=100

    go

    insert #testpivot(coid,prodid,orderamount)

    select top 5000000

    cast(rand(cast(newid() as varbinary)) *20 as integer)+1,

    cast(rand(cast(newid() as varbinary)) *50 as integer)+1,

    cast(rand(cast(newid() as varbinary)) *35000 as money)+1

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    go 2

    create clustered index pivot_pk on #testpivot(coid,prodid,rid)

    --testing - good ol' fashioned case statements - multiple varieties

    select '10M records, 20 companies over 50 products'

    --first with "no helper indexes" - rely solely on clustered index

    --variety #1

    declare @g datetime

    select @g=getdate()

    select prodid,

    sum(case when coid=1 then orderamount else 0 end) sum1,

    sum(case when coid=2 then orderamount else 0 end) sum2,

    sum(case when coid=3 then orderamount else 0 end) sum3,

    sum(case when coid=4 then orderamount else 0 end) sum4,

    sum(case when coid=5 then orderamount else 0 end) sum5,

    sum(case when coid=6 then orderamount else 0 end) sum6,

    sum(case when coid=7 then orderamount else 0 end) sum7,

    sum(case when coid=8 then orderamount else 0 end) sum8,

    sum(case when coid=9 then orderamount else 0 end) sum9,

    sum(case when coid=10 then orderamount else 0 end) sum10,

    sum(case when coid=11 then orderamount else 0 end) sum1,

    sum(case when coid=12 then orderamount else 0 end) sum2,

    sum(case when coid=13 then orderamount else 0 end) sum3,

    sum(case when coid=14 then orderamount else 0 end) sum4,

    sum(case when coid=15 then orderamount else 0 end) sum5,

    sum(case when coid=16 then orderamount else 0 end) sum6,

    sum(case when coid=17 then orderamount else 0 end) sum7,

    sum(case when coid=18 then orderamount else 0 end) sum8,

    sum(case when coid=19 then orderamount else 0 end) sum9,

    sum(case when coid=20 then orderamount else 0 end) sum10

    from #testpivot

    group by prodid

    select 'case method',datediff(ms,@g,getdate()),'no index'

    --pivot

    go

    DBCC FREEPROCCACHE

    declare @g datetime

    select @g=getdate()

    SELECT

    [1],[2] ,[3] ,[4],[5],[6],[7],[8],[9],[10],

    [11],[12] ,[13] ,[14],[15],[16],[17],[18],[19],[20]

    FROM (select coid,prodid,orderamount from #testpivot) t

    PIVOT (sum(orderamount) FOR coid IN ([1], [2],

    [3],[4],[5],[6],[7],[8],[9],[10],[11], [12],

    [13],[14],[15],[16],[17],[18],[19],[20])

    ) AS OrdProductPivot

    select 'pivot method',datediff(ms,@g,getdate()),'no index'

    go

    DBCC FREEPROCCACHE

    declare @g datetime

    --variety #2

    select @g=getdate()

    create index t_pivot4 on #testpivot(prodID) include (coid,orderamount)

    select @g=getdate()

    select prodid,

    sum(case when coid=1 then orderamount else 0 end) sum1,

    sum(case when coid=2 then orderamount else 0 end) sum2,

    sum(case when coid=3 then orderamount else 0 end) sum3,

    sum(case when coid=4 then orderamount else 0 end) sum4,

    sum(case when coid=5 then orderamount else 0 end) sum5,

    sum(case when coid=6 then orderamount else 0 end) sum6,

    sum(case when coid=7 then orderamount else 0 end) sum7,

    sum(case when coid=8 then orderamount else 0 end) sum8,

    sum(case when coid=9 then orderamount else 0 end) sum9,

    sum(case when coid=10 then orderamount else 0 end) sum10,

    sum(case when coid=11 then orderamount else 0 end) sum1,

    sum(case when coid=12 then orderamount else 0 end) sum2,

    sum(case when coid=13 then orderamount else 0 end) sum3,

    sum(case when coid=14 then orderamount else 0 end) sum4,

    sum(case when coid=15 then orderamount else 0 end) sum5,

    sum(case when coid=16 then orderamount else 0 end) sum6,

    sum(case when coid=17 then orderamount else 0 end) sum7,

    sum(case when coid=18 then orderamount else 0 end) sum8,

    sum(case when coid=19 then orderamount else 0 end) sum9,

    sum(case when coid=20 then orderamount else 0 end) sum10

    from #testpivot

    group by prodid

    select 'case method',datediff(ms,@g,getdate()),'prodid','coid+amt'

    --pivot

    go

    DBCC FREEPROCCACHE

    declare @g datetime

    select @g=getdate()

    SELECT

    [1],[2] ,[3] ,[4],[5],[6],[7],[8],[9],[10],

    [11],[12] ,[13] ,[14],[15],[16],[17],[18],[19],[20]

    FROM (select coid,prodid,orderamount from #testpivot) t

    PIVOT (sum(orderamount) FOR coid IN ([1], [2],

    [3],[4],[5],[6],[7],[8],[9],[10],[11], [12],

    [13],[14],[15],[16],[17],[18],[19],[20])

    ) AS OrdProductPivot

    select 'pivot method',datediff(ms,@g,getdate()),'prodid','coid+amt'

    drop index #testpivot.t_pivot4

    go

    declare @g datetime

    DBCC FREEPROCCACHE

    --variety #3

    create index t_pivot3 on #testpivot(prodID,coid) include (orderamount)

    select @g=getdate()

    select prodid,

    sum(case when coid=1 then orderamount else 0 end) sum1,

    sum(case when coid=2 then orderamount else 0 end) sum2,

    sum(case when coid=3 then orderamount else 0 end) sum3,

    sum(case when coid=4 then orderamount else 0 end) sum4,

    sum(case when coid=5 then orderamount else 0 end) sum5,

    sum(case when coid=6 then orderamount else 0 end) sum6,

    sum(case when coid=7 then orderamount else 0 end) sum7,

    sum(case when coid=8 then orderamount else 0 end) sum8,

    sum(case when coid=9 then orderamount else 0 end) sum9,

    sum(case when coid=10 then orderamount else 0 end) sum10,

    sum(case when coid=11 then orderamount else 0 end) sum1,

    sum(case when coid=12 then orderamount else 0 end) sum2,

    sum(case when coid=13 then orderamount else 0 end) sum3,

    sum(case when coid=14 then orderamount else 0 end) sum4,

    sum(case when coid=15 then orderamount else 0 end) sum5,

    sum(case when coid=16 then orderamount else 0 end) sum6,

    sum(case when coid=17 then orderamount else 0 end) sum7,

    sum(case when coid=18 then orderamount else 0 end) sum8,

    sum(case when coid=19 then orderamount else 0 end) sum9,

    sum(case when coid=20 then orderamount else 0 end) sum10

    from #testpivot

    group by prodid

    select 'case method',datediff(ms,@g,getdate()),'prodid+coid','amt'

    --pivot

    go

    DBCC FREEPROCCACHE

    declare @g datetime

    select @g=getdate()

    SELECT

    [1],[2] ,[3] ,[4],[5],[6],[7],[8],[9],[10],

    [11],[12] ,[13] ,[14],[15],[16],[17],[18],[19],[20]

    FROM (select coid,prodid,orderamount from #testpivot) t

    PIVOT (sum(orderamount) FOR coid IN ([1], [2],

    [3],[4],[5],[6],[7],[8],[9],[10],[11], [12],

    [13],[14],[15],[16],[17],[18],[19],[20])

    ) AS OrdProductPivot

    select 'pivot method',datediff(ms,@g,getdate()),'prodid+coid','amt'

    drop index #testpivot.t_pivot3

    go

    declare @g datetime

    DBCC FREEPROCCACHE

    --variety #4

    create index t_pivot2 on #testpivot(coID,prodid) include (orderamount)

    select @g=getdate()

    select prodid,

    sum(case when coid=1 then orderamount else 0 end) sum1,

    sum(case when coid=2 then orderamount else 0 end) sum2,

    sum(case when coid=3 then orderamount else 0 end) sum3,

    sum(case when coid=4 then orderamount else 0 end) sum4,

    sum(case when coid=5 then orderamount else 0 end) sum5,

    sum(case when coid=6 then orderamount else 0 end) sum6,

    sum(case when coid=7 then orderamount else 0 end) sum7,

    sum(case when coid=8 then orderamount else 0 end) sum8,

    sum(case when coid=9 then orderamount else 0 end) sum9,

    sum(case when coid=10 then orderamount else 0 end) sum10,

    sum(case when coid=11 then orderamount else 0 end) sum1,

    sum(case when coid=12 then orderamount else 0 end) sum2,

    sum(case when coid=13 then orderamount else 0 end) sum3,

    sum(case when coid=14 then orderamount else 0 end) sum4,

    sum(case when coid=15 then orderamount else 0 end) sum5,

    sum(case when coid=16 then orderamount else 0 end) sum6,

    sum(case when coid=17 then orderamount else 0 end) sum7,

    sum(case when coid=18 then orderamount else 0 end) sum8,

    sum(case when coid=19 then orderamount else 0 end) sum9,

    sum(case when coid=20 then orderamount else 0 end) sum10

    from #testpivot

    group by prodid

    select 'case method',datediff(ms,@g,getdate()),'coID+prodid','amt'

    --pivot

    go

    DBCC FREEPROCCACHE

    declare @g datetime

    select @g=getdate()

    SELECT

    [1],[2] ,[3] ,[4],[5],[6],[7],[8],[9],[10],

    [11],[12] ,[13] ,[14],[15],[16],[17],[18],[19],[20]

    FROM (select coid,prodid,orderamount from #testpivot) t

    PIVOT (sum(orderamount) FOR coid IN ([1], [2],

    [3],[4],[5],[6],[7],[8],[9],[10],[11], [12],

    [13],[14],[15],[16],[17],[18],[19],[20])

    ) AS OrdProductPivot

    select 'pivot method',datediff(ms,@g,getdate()),'coID+prodid','amt'

    drop index #testpivot.t_pivot2

    go

    declare @g datetime

    DBCC FREEPROCCACHE

    --variety #5

    create index t_pivot on #testpivot(coID) include (prodid,orderamount)

    select @g=getdate()

    select prodid,

    sum(case when coid=1 then orderamount else 0 end) sum1,

    sum(case when coid=2 then orderamount else 0 end) sum2,

    sum(case when coid=3 then orderamount else 0 end) sum3,

    sum(case when coid=4 then orderamount else 0 end) sum4,

    sum(case when coid=5 then orderamount else 0 end) sum5,

    sum(case when coid=6 then orderamount else 0 end) sum6,

    sum(case when coid=7 then orderamount else 0 end) sum7,

    sum(case when coid=8 then orderamount else 0 end) sum8,

    sum(case when coid=9 then orderamount else 0 end) sum9,

    sum(case when coid=10 then orderamount else 0 end) sum10,

    sum(case when coid=11 then orderamount else 0 end) sum1,

    sum(case when coid=12 then orderamount else 0 end) sum2,

    sum(case when coid=13 then orderamount else 0 end) sum3,

    sum(case when coid=14 then orderamount else 0 end) sum4,

    sum(case when coid=15 then orderamount else 0 end) sum5,

    sum(case when coid=16 then orderamount else 0 end) sum6,

    sum(case when coid=17 then orderamount else 0 end) sum7,

    sum(case when coid=18 then orderamount else 0 end) sum8,

    sum(case when coid=19 then orderamount else 0 end) sum9,

    sum(case when coid=20 then orderamount else 0 end) sum10

    from #testpivot

    group by prodid

    select 'case method',datediff(ms,@g,getdate()),'coID','prodid+amt'

    --pivot

    go

    DBCC FREEPROCCACHE

    declare @g datetime

    select @g=getdate()

    SELECT

    [1],[2] ,[3] ,[4],[5],[6],[7],[8],[9],[10],

    [11],[12] ,[13] ,[14],[15],[16],[17],[18],[19],[20]

    FROM (select coid,prodid,orderamount from #testpivot) t

    PIVOT (sum(orderamount) FOR coid IN ([1], [2],

    [3],[4],[5],[6],[7],[8],[9],[10],[11], [12],

    [13],[14],[15],[16],[17],[18],[19],[20])

    ) AS OrdProductPivot

    select 'pivot method',datediff(ms,@g,getdate()),'coID','prodid+amt'

    drop index #testpivot.t_pivot

    go

    Now...for the results:

    10M records, 20 companies, 50 products

    06 case method38513no index

    01 pivot method36500no index

    07 case method39110prodidcoid+amt

    10 pivot method41123prodidcoid+amt

    05 case method37750prodid+coidamt

    09 pivot method40766prodid+coidamt

    03 case method37283coID+prodidamt

    04 pivot method37393coID+prodidamt

    08 case method39479coIDprodid+amt

    02 pivot method36596coIDprodid+amt

    So - long story to say - they both BASICALLY run the same (statistically a dead heat I'd say), with the two best times being returned by the NEW method. The right clustered index helps, but most "helper" indexes don't help - they actually make performance worse.

    For the record - I'm also running them now without a clustered index that's helpful...we shall see what we get.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Without a "useful" clustered index (i.e. using rowID as the clustered primary key), here are the results:

    05 case method38563no index

    08 pivot method41030no index

    06 case method39186prodidcoid+amt

    10 pivot method41420prodidcoid+amt

    01 case method36216prodid+coidamt

    09 pivot method41110prodid+coidamt

    03 case method38186coID+prodidamt

    02 pivot method36580coID+prodidamt

    03 case method38186coIDprodid+amt

    07 pivot method40673coIDprodid+amt

    So - again - the indexing doesn't seem to do all that much and the results for both method are essentially tied.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 15 posts - 16 through 30 (of 46 total)

You must be logged in to reply to this topic. Login to reply