Window function on different date ranges

  • Hi,

    I have a question regarding windowing functions. I have a sales order table with the columns "orderid", "customerid", "order_date" and "amount". I use the following query to get the amount of every customer as a additional column:

    Select customerid,

    orderid,

    order_date,

    amount,

    SUM(amount) OVER (PARTITION BY customerid)

    FROM sales_orders

    My question is if there is a good way to add another column, which includes the SUM(amount) of the customerid, where the order_date > 2012-01-15 , something like this:

    Select customerid,

    orderid,

    order_date,

    amount,

    SUM(amount) OVER (PARTITION BY customerid),

    SUM(amount) OVER (PARTITION BY customerid WHERE order_date > 2012-01-15)

    FROM sales_orders

    I know, this is not a valid method, so do you know a way to achieve this? Can I maybe use CROSS APPLY or something like this? I know that I could use a subquery to get this, but is there maybe a way / a better way via window functions?

    Best regards!

  • i think a field with a SUM+CASE statement would do the job:

    SUM(CASE WHEN order_date > 2012-01-15 THEN amount ELSE 0 END ) OVER (PARTITION BY customerid )

    --this passes 2012 syntax

    Select customerid,

    orderid,

    order_date,

    amount,

    SUM(amount) OVER (PARTITION BY customerid),

    SUM(CASE WHEN order_date > 2012-01-15 THEN amount ELSE 0 END ) OVER (PARTITION BY customerid )

    FROM sales_orders

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell has provided you a perfectly good solution to your question. However, you may want to also consider something like the following for performance reasons:

    WITH PreAggregate AS

    (

    SELECT customerid

    ,sa1=SUM(amount)

    ,sa2=SUM(CASE WHEN order_date > '2012-01-15' THEN amount ELSE 0 END)

    FROM sales_orders

    GROUP BY customerid

    )

    Select customerid, orderid, order_date, amount, sa1, sa2

    FROM sales_orders a

    JOIN PreAggregate b ON a.customerid = b.customerid;

    Reference: The Performance of the T-SQL Window Functions [/url]

    Edit: Lowell, please pardon me for originally suggesting that your fine solution was the work of Luis. I didn't get Luis's post until just now. [face-slap]/face-slap]


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (2/5/2014)


    Luis has provided you a perfectly good solution to your question.

    I managed to get into your head. I hope you won't have nightmares :hehe:

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (2/5/2014)


    dwain.c (2/5/2014)


    Luis has provided you a perfectly good solution to your question.

    I managed to get into your head. I hope you won't have nightmares :hehe:

    Only if I start hearing Zombie by the Cranberries playing over and over in my head.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Wow, thats amazing, thanks alot! I never thought of using a CASE-Statement in a SUM() function. That helped me very much!

    Is there also a way to use the FIRST_VALUE(order_date) window function this way (to show the first order_date after the 2012-01-15 and another column, which shows the first order_date without a date restriction for every customer)?

    Best regards!

  • vip.blade (2/6/2014)


    Wow, thats amazing, thanks alot! I never thought of using a CASE-Statement in a SUM() function. That helped me very much!

    Is there also a way to use the FIRST_VALUE(order_date) window function this way (to show the first order_date after the 2012-01-15 and another column, which shows the first order_date without a date restriction for every customer)?

    Best regards!

    That does sound possible to me. I'd need some DDL and consumable sample data if you want a tested bit of code for it though.

    [This is one of the very few times I've ever posted a solution without DDL and sample data to test it with]


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (2/6/2014)


    vip.blade (2/6/2014)


    Wow, thats amazing, thanks alot! I never thought of using a CASE-Statement in a SUM() function. That helped me very much!

    Is there also a way to use the FIRST_VALUE(order_date) window function this way (to show the first order_date after the 2012-01-15 and another column, which shows the first order_date without a date restriction for every customer)?

    Best regards!

    That does sound possible to me. I'd need some DDL and consumable sample data if you want a tested bit of code for it though.

    [This is one of the very few times I've ever posted a solution without DDL and sample data to test it with]

    I use the [AdventureWorks2012].[Sales].[SalesOrderHeader] table for my tests:

    SELECT [CustomerID]

    ,SalesOrderID

    ,[OrderDate]

    ,[TotalDue]

    ,SUM(CASE WHEN CAST([OrderDate] AS DATE) > '2008-02-20' THEN [TotalDue] ELSE 0 END) OVER (PARTITION BY CustomerID)

    FROM [AdventureWorks2012].[Sales].[SalesOrderHeader]

    Do you have this test database or should I provide you some more information?

    Best regards and thanks alot!

  • Sorry but I don't have/use Adventure Works.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Ok, so here is the DDL and some sample data:

    CREATE TABLE [Sales].[SalesOrderHeader](

    [SalesOrderID] [int],

    [OrderDate] [datetime] NOT NULL,

    [CustomerID] [int] NOT NULL,

    [TotalDue] [Numeric](10,2),

    CONSTRAINT [PK_SalesOrderHeader_SalesOrderID] PRIMARY KEY CLUSTERED

    (

    [SalesOrderID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    INSERT INTO [AdventureWorks2012].[Sales].[SalesOrderHeader] ([SalesOrderID]

    ,[OrderDate]

    ,[CustomerID]

    ,[TotalDue])

    VALUES ('43659','2005-07-01 00:00:00.000', '29825', 23153.23),

    ('43660','2005-07-02 00:00:00.000', '29825', 23111.82),

    ('43661','2013-08-02 00:00:00.000', '29825', 11.82),

    ('43662','2014-01-05 00:00:00.000', '29825', 111.22),

    ('43663','2007-05-05 00:00:00.000', '29826', 541.22),

    ('43664','2003-12-12 00:00:00.000', '29826', 1641.22),

    ('43665','2009-08-12 00:00:00.000', '29826', 11641.22)

    If you need more sample data, please let me know.

    Best regards and thanks!

  • You can use FIRST_VALUE for the first case or MIN for either case.

    WITH PreAggregate AS

    (

    SELECT CustomerID, OrderDate=MIN(OrderDate)

    --FROM #SalesOrderHeader

    FROM [Sales].[SalesOrderHeader]

    WHERE OrderDate > '2012-01-15'

    GROUP BY CustomerID

    )

    SELECT *

    ,EarliestOrderDate1=FIRST_VALUE(a.OrderDate) OVER(PARTITION BY a.CustomerID ORDER BY a.OrderDate)

    ,EarliestOrderDate1=MIN(a.OrderDate) OVER(PARTITION BY a.CustomerID ORDER BY a.OrderDate)

    ,OrderDate1=MIN(CASE WHEN a.OrderDate > '2012-01-15' THEN a.OrderDate END) OVER(PARTITION BY a.CustomerID ORDER BY a.OrderDate)

    ,OrderDate2=b.OrderDate

    --FROM #SalesOrderHeader a

    FROM [Sales].[SalesOrderHeader]

    JOIN PreAggregate b ON a.CustomerID = b.CustomerID

    ORDER BY a.CustomerID, a.OrderDate;

    However if you need to apply the order date > 2012-01-15 to all rows, I believe you'll need to use the pre-aggregate.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thank you very much, it works like a charm! That helped me a lot and I learned something new - as always when I open a topic in this forum. 🙂

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply