Case when with a Pivot

  • Hello,

    I'm trying to write a AR aging select query that  uses a case statement to indicate if the invoice due date is 30, 60, 90 over due etc.

    I need to then use the pivot function to get columns as  30,60, 90  , the problem I'm facing is the pivot doesn't seem to recognize the  Case  column name.  I'm not sure I'm doing it the best way, but this is what I have so far.

    select  [amount]
    from

    (select
    sum(amount) as Total
    --, ROW_NUMBER() over (partition by due order by due_date) as nrow
    , case when DATEDIFF(d, due_date, getdate()) between 1 and 30 then 1-30
    when DATEDIFF(d, due_date, getdate()) between 31 and 60 then 31-60
    when DATEDIFF(d, due_date, getdate()) between 61 and 90 then 61-90
    when DATEDIFF(d, due_date, getdate()) > 90 then 100
    else 0 end as Due

    FROM [EES_App].[dbo].[artran_all]
    where Cust_num like N'%5687'
    -- and type <> 'P'
    and due_date >= '2020-03-001 00:00:00.000'
    group by amount,due_date) Temp

    Pivot
    (
    Max(amount)
    for due in ('1-30','31-60','61-90',100)
    ) piv

    I would like to get it in a format like this

    table

    Thanks for any help.

     

  • Since you attempt at code and what you posted as your desired final output have several difference between the two, I split the difference and can derived the following code.  I believe that you're also missing a bit of criteria that separates paid from unpaid amounts due (and a date isn't going to do that for you) but I'll let you science that simple WHERE clause out.

    Here's the code.

     SELECT  Cust_num
    ,Co_num
    ,Total = SUM(amount)
    ,[1-30 Days] = SUM(CASE WHEN DATEDIFF(dd,due_date,GETDATE())+1 BETWEEN 1 AND 30 THEN amount ELSE 0 END)
    ,[31-60 Days] = SUM(CASE WHEN DATEDIFF(dd,due_date,GETDATE())+1 BETWEEN 11 AND 60 THEN amount ELSE 0 END)
    ,[61-90 Days] = SUM(CASE WHEN DATEDIFF(dd,due_date,GETDATE())+1 BETWEEN 61 AND 90 THEN amount ELSE 0 END)
    ,[>90 Days] = SUM(CASE WHEN DATEDIFF(dd,due_date,GETDATE())+1 > 90 THEN amount ELSE 0 END)
    FROM EES_App.dbo.artran_all
    GROUP BY Cust_num, Co_num
    ;

    As bit of a sidebar, I tend to avoid PIVOT because it's usually slower than the above (which is known as a "Classic CROSSTAB") and it's easier to add things like the Total column, and easier to control different ranges of data, like this problem requires.

    If you end up with a LOT of data, pre-aggregating the sums before you do the SUM/CASE pivots will make it faster still.  Please see the following "Black Arts" article on the subject for more information on this ancient but incredibly useful methodology.

    https://www.sqlservercentral.com/articles/cross-tabs-and-pivots-part-1-%e2%80%93-converting-rows-to-columns-1

    Remembering what you posted in your previous post, you really need to do something about getting rid of the leading spaces in the Cust_num column.  Your best bet would be to convert it to an INT datatype IF all the Cust_num's are supposed to be digits-only.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • select
    Cust_num, Co_num,
    sum(amount) Total,
    sum(iif(floor(datediff(d,due_date,getdate())/30)=0,amount,0)) [1-30 Days],
    sum(iif(floor(datediff(d,due_date,getdate())/30)=1,amount,0)) [31-60 Days],
    sum(iif(floor(datediff(d,due_date,getdate())/30)=2,amount,0)) [61-90 Days],
    sum(iif(floor(datediff(d,due_date,getdate())/30)>2,amount,0)) [>90 Days]
    from EES_App.dbo.artran_all
    group by Cust_num, Co_num;

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Steve Collins wrote:

    select
    Cust_num, Co_num,
    sum(amount) Total,
    sum(iif(floor(datediff(d,due_date,getdate())/30)=0,amount,0)) [1-30 Days],
    sum(iif(floor(datediff(d,due_date,getdate())/30)=1,amount,0)) [31-60 Days],
    sum(iif(floor(datediff(d,due_date,getdate())/30)=2,amount,0)) [61-90 Days],
    sum(iif(floor(datediff(d,due_date,getdate())/30)>2,amount,0)) [>90 Days]
    from EES_App.dbo.artran_all
    group by Cust_num, Co_num;

    Since integer division is involved in your good code, you shouldn't need to do the FLOOR.  IIF still resolves to CASE statements behind the scenes but it does shorten up the code.  Nicely done.

    This would work very well for using pre-aggregation prior to the CROSSTAB for improved performance, as well.  Pre-aggregation does make the code longer but it (usually) makes it about twice as fast.  With the understanding that, in the absence of any test data, I've not tested this (or the previous) code, here's what I'm talking about for pre-aggregation.  Again, it looks like it should be slower because of the two GROUP BYs but usually results in twice the performance for both CROSSTABs and (ugh!) PIVOTs.

    WITH 
    ctePreAgg AS
    (
    SELECT Cust_num,Co_num
    ,DaysGroup = DATEDIFF(dd,due_date,GETDATE())/30
    ,DaysSum = SUM(amount)
    FROM EES_App.dbo.artran_all
    GROUP BY Cust_num, Co_num, DATEDIFF(dd,due_date,GETDATE())/30
    )
    SELECT Cust_num, Co_num
    ,Total = SUM(DaysSum)
    ,[1-30 Days] = SUM(IIF(DaysGroup = 0,DaysSum,0))
    ,[31-60 Days] = SUM(IIF(DaysGroup = 1,DaysSum,0))
    ,[61-90 Days] = SUM(IIF(DaysGroup = 2,DaysSum,0))
    ,[>90 Days] = SUM(IIF(DaysGroup > 2,DaysSum,0))
    FROM ctePreAgg
    GROUP BY Cust_num, Co_num
    ;

    And, thank you very much for reminding me about IIF.  While it's no faster than CASE statements, it IS a great visual shortcut for readability.  I've been working with old databases for so long that I usually forget about it even if I know I'm working on a later version of SQL Server that supports its.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Very very nice pre-aggregation!  Now thaaaat's the way to do it.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • In both solutions - I would move the calculation to a CROSS APPLY:

    SELECT  Cust_num
    ,Co_num
    ,Total = SUM(amount)
    ,[1-30 Days] = SUM(CASE WHEN d.Days BETWEEN 1 AND 30 THEN amount ELSE 0 END)
    ,[31-60 Days] = SUM(CASE WHEN d.Days BETWEEN 11 AND 60 THEN amount ELSE 0 END)
    ,[61-90 Days] = SUM(CASE WHEN d.Days BETWEEN 61 AND 90 THEN amount ELSE 0 END)
    ,[>90 Days] = SUM(CASE WHEN d.Days > 90 THEN amount ELSE 0 END)
    FROM EES_App.dbo.artran_all
    CROSS APPLY (VALUES (DATEDIFF(dd,due_date,GETDATE())+1)) AS d(Days)
    GROUP BY Cust_num, Co_num

     

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Steve Collins wrote:

    Very very nice pre-aggregation!  Now thaaaat's the way to do it.

    It's not just me... we did it together.  You reminded me not only of the IIF function but also of the advantages of integer division.  Team effort.  That's why I love this site and appreciate good folks like yourself.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeffrey Williams wrote:

    In both solutions - I would move the calculation to a CROSS APPLY:

    SELECT  Cust_num
    ,Co_num
    ,Total = SUM(amount)
    ,[1-30 Days] = SUM(CASE WHEN d.Days BETWEEN 1 AND 30 THEN amount ELSE 0 END)
    ,[31-60 Days] = SUM(CASE WHEN d.Days BETWEEN 11 AND 60 THEN amount ELSE 0 END)
    ,[61-90 Days] = SUM(CASE WHEN d.Days BETWEEN 61 AND 90 THEN amount ELSE 0 END)
    ,[>90 Days] = SUM(CASE WHEN d.Days > 90 THEN amount ELSE 0 END)
    FROM EES_App.dbo.artran_all
    CROSS APPLY (VALUES (DATEDIFF(dd,due_date,GETDATE())+1)) AS d(Days)
    GROUP BY Cust_num, Co_num

    That's the 4th different solution posted.  I smell a million row test coming on. 🙂

    What I need to know from the OP is... can a Cust_num have more that one Co_num or not?  What say thee, @gjoelson ???

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This puts it all together in 2 styles.

    /* Jeff's style */
    SELECT Cust_num, Co_num
    ,Total = SUM(amount)
    ,[1-30 Days] = SUM(IIF(DaysGroup = 0,amount,0))
    ,[31-60 Days] = SUM(IIF(DaysGroup = 1,amount,0))
    ,[61-90 Days] = SUM(IIF(DaysGroup = 2,amount,0))
    ,[>90 Days] = SUM(IIF(DaysGroup = 3,amount,0))
    FROM EES_App.dbo.artran_all
    CROSS APPLY (VALUES (DATEDIFF(dd,due_date,GETDATE())/30)) AS d(DaysGroup)
    GROUP BY Cust_num, Co_num
    ;

    /* Steve's style (scdecade) */
    select
    Cust_num, Co_num,
    sum(a.amount) Total,
    sum(iif(d.days_group=0,amount,0)) [1-30 Days],
    sum(iif(d.days_group=1,amount,0)) [31-60 Days],
    sum(iif(d.days_group=2,amount,0)) [61-90 Days],
    sum(iif(d.days_group>2,amount,0)) [>90 Days]
    from EES_App.dbo.artran_all a
    cross apply (values (datediff(d,a.due_date,getdate())/30)) as d(days_group)
    group by Cust_num, Co_num;

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Bloody amazing machines nowadays.  The laptop I'm currently using has the following specs...

    Processor Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, 2208 Mhz, 6 Core(s), 12 Logical Processor(s) (w/automatic Overclock up to 4GhZ)

    32GB RAM

    2TB NVME SSD

    1TB SATA spinning rust.

    I don't have the time to post all of the code I tested with but my findings are that the Pre-Aggregation didn't matter much (regardless of method)  in this case and IIF was a 100-200 ms faster than CASE/BETWEEN.  FLOOR cost virtually nothing (in this case) but wasn't necessary.  The use of IIF and the CROSS APPLY greatly simplified the code.

    Thank you both (Steve/scdecade and Jeffrey Williams)  for playing.

    Oh.. almost forgot.  Here's the test table code I used (details in the code).  I did it in TempDB and as a real table on my SSD Driven and on my spinning rust.  It's currently setup for TempDB.  It uses my "fnTally" function as a "Pseudo-Cursor" row source to replace RBAR methods of generating rows of random but constrained test data.  That function can be found at the similarly named link in my signature line below or you can use one of your own methods.  The code we've all written returns about 350K rows from this and does so in about 5.1 seconds or so.

    --===== If the test table already exists, drop it to make reruns in SSMS easier.
    IF OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL DROP TABLE #TestTable
    ;
    GO
    --===== Create and populate a test table. This is not a part of the solution.
    -- Don''t let the 5 million rows scare you. It takes less than 2 seconds on a decent box.
    SELECT Cust_num = CHAR(ABS(CHECKSUM(NEWID())%26)+65) --17,576 Customer "numbers" from "AAA" to "ZZZ"
    + CHAR(ABS(CHECKSUM(NEWID())%26)+65)
    + CHAR(ABS(CHECKSUM(NEWID())%26)+65)
    ,Co_num = ABS(CHECKSUM(NEWID())%20)+1200 --1200 to 1219
    ,amount = CONVERT(DECIMAL(9,2),RAND(CHECKSUM(NEWID()))*1000) --0.00 to 1,000.00 possible due to rounding
    ,due_date = RAND(CHECKSUM(NEWID()))*DATEDIFF(dd,'2015',GETDATE())+CONVERT(DATETIME,'2015') --01 Jan 2015 up to NOW with times
    INTO #TestTable
    FROM dbo.fnTally(1,5000000)
    ;
    CHECKPOINT;
    GO

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff/Steve a big thank you.

    I certainly appreciate all you guys knowledge, its a treat reading the back and forth and learning from the best.

    I love the crosstab option, 1st time seeing it...and definitely  easier to do the aggregation and totals.

    I've used the pivot on a few occasions and it always presented challenges and seemed slow and clunky, probably because I never did any pre-aggregating it was all mixed in.

    I will be getting into that "Black Arts" article for sure - thank you.

    To answer the question "can a Cust_num have more that one Co_num or not?"  yes,  in our case customers can have multiple Co_num , so need to figure a way to separate those out.

    again thank you.

  • gjoelson 29755 wrote:

    To answer the question "can a Cust_num have more that one Co_num or not?"  yes,  in our case customers can have multiple Co_num , so need to figure a way to separate those out.

    ABD!  (Already Been Done).  The 5 million row test and all the code we've written as a group handles that.  The only thing not done is a final rollup by Cust_Num, which wasn't included in the original problem.

    And, thank you very much for the feedback.  Don't forget the other "Jeff"... he's the one that came up with the CROSS APPLY.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • For completeness sake - here is the query using PIVOT:

     --==== Using PIVOT and datediff / 30 to determine category
    Select pvt.Cust_num
    , pvt.Co_num
    , pvt.TotalAmount
    , pvt.[1-30 Days]
    , pvt.[31-60 Days]
    , pvt.[61-90 Days]
    , pvt.[>90 Days]
    From (Select aa.Cust_num
    , aa.Co_num
    , TotalAmount = sum(aa.amount) over(Partition By aa.Cust_num, aa.Co_num)
    , Category = Case d.Days / 30
    When 0 Then '1-30 Days'
    When 1 Then '31-60 Days'
    When 2 Then '61-90 Days'
    Else '>90 Days'
    End
    , aa.amount
    From @artran_all aa
    Cross Apply (Values (datediff(day, getdate(), aa.due_date))) As d(Days)
    ) As x
    Pivot (sum(x.amount) For Category In ([1-30 Days], [31-60 Days], [61-90 Days], [>90 Days])) As pvt;

    --==== Using PIVOT and days BETWEEN to determine category
    Select pvt.Cust_num
    , pvt.Co_num
    , pvt.TotalAmount
    , pvt.[1-30 Days]
    , pvt.[31-60 Days]
    , pvt.[61-90 Days]
    , pvt.[>90 Days]
    From (Select aa.Cust_num
    , aa.Co_num
    , TotalAmount = sum(aa.amount) over(Partition By aa.Cust_num, aa.Co_num)
    , Category = Case When d.Days Between 1 And 30 Then '1-30 Days'
    When d.Days Between 31 And 60 Then '31-60 Days'
    When d.Days Between 61 And 90 Then '61-90 Days'
    Else '>90 Days'
    End
    , aa.amount
    From @artran_all aa
    Cross Apply (Values (datediff(day, getdate(), aa.due_date) + 1)) As d(Days)
    ) As x
    Pivot (sum(x.amount) For Category In ([1-30 Days], [31-60 Days], [61-90 Days], [>90 Days])) As pvt;

    @JeffModen - I don't believe the IIF is the difference in performance since that will be converted to a CASE expression in the execution plan anyways.  The real difference is the BETWEEN check - rather than the simpler (faster?) math...and the FLOOR is not necessary for integer math (which you already know).

    As you can see - the PIVOT code is much more verbose to get to the same results.

     

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • That's why I don't use PIVOT code.  You should see the trainwreck when you need more than one PIVOT in the same query (for example: Qty and Amount for each month).

    And, yes... I understand that IIF resolves to a CASE statement (I was actually the first to say so on this thread).  You're correct though... it may be the fact that BETWEEN was replace with simpler integer math that did the trick.

    The really fun part about this thread is how we collectively ended up with a very small and easy to read query that runs nasty fast.  I love this community.  Thanks for being a part of it, Jeffrey.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply