Date Ranges - Simple, but seemingly impossible

  • Hi all,

    I've done a lot of research and tried so many things for around a week. I'm desperate at this point, so I'm asking for your help.

    I have this ProductDiscounts table:

    --------------------Scripts for creating the table:---------------------
    CREATE TABLE ProductDiscounts (
    Stack VARCHAR (255),
    ProductCode VARCHAR (255),
    ValidFrom DATE,
    ValidTo DATE,
    Discount1 decimal(4,3))
    ;

    INSERT INTO ProductDiscounts
    (Stack, ProductCode, ValidFrom, ValidTo, Discount1)
    VALUES
    ('A','PROD40','20000101','20401231',0.1),
    ('B','PROD40','20090201','20110131',0.2),
    ('C','PROD40','20110201','20411231',0.15),
    ('D','PROD40','20090201','20401231',0.3)
    ;

    -----------------------------------------------------------------------------------

    Here's a look at the timeframe:

    Objective: I need to create a query to slice these timeframes in a correct way to be able to stack them up so that I can later use them to calculate the discounts that will be applied to a certain day.

    In terms of timeframe (for a better understanding) the final result should be like this:

    So the final result should be:

    As you can see, I'm able to Stack the discounts into common timeframes. This is the result I need and I cannot reach no matter what I try. There's always something not correct.

    Here's what I got so far:
    I have a query to slice the dates using ONLY the ValidFrom and ValidTo from the first table:

    But here's the problem: the dates overlap.
    For example, if I have a sale on the 2011-01-31, this date is present in line 2, 3 and 4 but it is also present in the line 5 and 6, and it should only be present in 2, 3 and 4, because I need to apply the discounts from Stack 2, 3 and 4 and not 5 and 6.

    ------------------Here's the query I have so far to achieve the result right above:------------------
    SELECT PD.[Stack]
        ,DR.[ProductCode]
        ,DR.[ValidFrom]
        ,DR.[ValidTo]
        ,PD.[Discount1]
    FROM (
        SELECT [Stack]
            ,[ProductCode]
            ,[ValidFrom]
            ,LEAD([ValidFrom]) OVER (PARTITION BY ProductCode ORDER BY [ValidFrom]) AS [ValidTo]
            ,[Discount1]
        FROM (
            SELECT [Stack]
                ,[ProductCode]
                ,[ValidFrom]
                ,[Discount1]
            FROM [dbo].[ProductDiscounts]
            
            UNION
            
            SELECT [Stack]
                ,[ProductCode]
                ,[ValidTo]
                ,[Discount1]
            FROM [dbo].[ProductDiscounts]
            ) Interceptions
        ) DR
    INNER JOIN [dbo].[ProductDiscounts] PD ON DR.ProductCode = PD.ProductCode
        AND DR.[ValidFrom] BETWEEN PD.[ValidFrom] AND PD.[ValidTo]
        AND DR.[ValidTo] BETWEEN PD.[ValidFrom]    AND PD.[ValidTo]
    WHERE DATEDIFF(DAY, DR.[ValidFrom], DR.[ValidTo]) > 0
    ORDER BY DR.[ValidFrom] ASC,PD.[Stack] ASC

    -----------------------------------------------------------------------------------------------------------------

    Any idea on how to achieve the final result? I can't thank enough if someone tries to help me.

    PS: Many Thanks to Thom A and drew.allen for helping out on a similar problem which helped me get to the query above that I have right now. Although when I applied it to the real database, I found the problems leading to this post and now I'm trying to figure out how to solve them and am asking once again for the help of the experts of this forum.

  • This problem is straight forward if you realize what is missing in the set which is the date range property.
    😎

    Here is a quick solution which generates an "inline calendar" to create the ranges

    USE [TEEST]
    GO

    SET NOCOUNT ON;

    IF OBJECT_ID('dbo.ProductDiscounts') IS NOT NULL DROP TABLE dbo.ProductDiscounts;

    CREATE TABLE dbo.ProductDiscounts
    (
      Stack   VARCHAR (5)
     ,ProductCode VARCHAR (10)
     ,ValidFrom DATE
     ,ValidTo  DATE
     ,Discount1 DECIMAL(4,3)
    )
    ;

    INSERT INTO dbo.ProductDiscounts (Stack, ProductCode, ValidFrom, ValidTo, Discount1)
    VALUES
    ('A','PROD40','20000101','20401231',0.1),
    ('B','PROD40','20090201','20110131',0.2),
    ('C','PROD40','20110201','20411231',0.15),
    ('D','PROD40','20090201','20401231',0.3)

    ;WITH DATE_SET AS
    (
      SELECT DISTINCT
       PRD.TOFROM
      ,PRD.VDATE
      FROM  dbo.ProductDiscounts  PD
      CROSS APPLY
       (
        SELECT 1,PD.ValidFrom UNION ALL
        SELECT 0,PD.ValidTo
       ) PRD(TOFROM,VDATE)
    )
    ,PERIOD_SET AS
    (
      SELECT
       DS.TOFROM
       ,DS.VDATE        AS FROM_DATE      
       ,LEAD(DS.VDATE,1,NULL) OVER
        (
          ORDER BY DS.VDATE ASC
        )         AS TO_DATE
      FROM  DATE_SET  DS
    )
    ,FILTERED_PERIODS AS
    (
      SELECT
       PS.TOFROM
      ,PS.FROM_DATE
      ,PS.TO_DATE
      FROM   PERIOD_SET     PS
      WHERE PS.TO_DATE IS NOT NULL
    )
    ,DATE_RANGE AS
    (
      SELECT
       MIN(FP.FROM_DATE) AS MIN_DATE
       ,MAX(FP.TO_DATE) AS MAX_DATE
       ,DATEDIFF(MONTH,MIN(FP.FROM_DATE),MAX(FP.TO_DATE)) AS NUM_YEAR
      FROM  FILTERED_PERIODS  FP
    )
    ,T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))
    , NUMS(N) AS (SELECT TOP((SELECT DR.NUM_YEAR FROM DATE_RANGE DR)) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N
         FROM DATE_RANGE DR,T T1,T T2,T T3)
    SELECT DISTINCT
    PS.Stack
    ,PS.ProductCode
    ,FP.FROM_DATE
    ,FP.TO_DATE
    ,PS.Discount1      
    FROM    DATE_RANGE DR
    CROSS APPLY  NUMS  NM
    CROSS APPLY  dbo.ProductDiscounts  PS
    CROSS APPLY  FILTERED_PERIODS   FP
    WHERE DATEADD(MONTH, NM.N, DR.MIN_DATE) > FP.FROM_DATE
    AND  DATEADD(MONTH, NM.N, DR.MIN_DATE) < FP.TO_DATE
    AND  DATEADD(MONTH, NM.N, DR.MIN_DATE) > PS.ValidFrom
    AND  DATEADD(MONTH, NM.N, DR.MIN_DATE) < PS.ValidTo
    ORDER BY FP.FROM_DATE ASC
    ;

    Output

    Stack ProductCode FROM_DATE  TO_DATE    Discount1
    ----- ----------- ---------- ---------- ------------------------
    A     PROD40      2000-01-01 2009-02-01 0.100
    A     PROD40      2009-02-01 2011-01-31 0.100
    B     PROD40      2009-02-01 2011-01-31 0.200
    D     PROD40      2009-02-01 2011-01-31 0.300
    A     PROD40      2011-02-01 2040-12-31 0.100
    C     PROD40      2011-02-01 2040-12-31 0.150
    D     PROD40      2011-02-01 2040-12-31 0.300
    C     PROD40      2040-12-31 2041-12-31 0.150

  • Eirikur Eiriksson - Sunday, November 5, 2017 2:02 AM

    This problem is straight forward if you realize what is missing in the set which is the date range property.
    😎

    Here is a quick solution which generates an "inline calendar" to create the ranges

    USE [TEEST]
    GO

    SET NOCOUNT ON;

    IF OBJECT_ID('dbo.ProductDiscounts') IS NOT NULL DROP TABLE dbo.ProductDiscounts;

    CREATE TABLE dbo.ProductDiscounts
    (
      Stack   VARCHAR (5)
     ,ProductCode VARCHAR (10)
     ,ValidFrom DATE
     ,ValidTo  DATE
     ,Discount1 DECIMAL(4,3)
    )
    ;

    INSERT INTO dbo.ProductDiscounts (Stack, ProductCode, ValidFrom, ValidTo, Discount1)
    VALUES
    ('A','PROD40','20000101','20401231',0.1),
    ('B','PROD40','20090201','20110131',0.2),
    ('C','PROD40','20110201','20411231',0.15),
    ('D','PROD40','20090201','20401231',0.3)

    ;WITH DATE_SET AS
    (
      SELECT DISTINCT
       PRD.TOFROM
      ,PRD.VDATE
      FROM  dbo.ProductDiscounts  PD
      CROSS APPLY
       (
        SELECT 1,PD.ValidFrom UNION ALL
        SELECT 0,PD.ValidTo
       ) PRD(TOFROM,VDATE)
    )
    ,PERIOD_SET AS
    (
      SELECT
       DS.TOFROM
       ,DS.VDATE        AS FROM_DATE      
       ,LEAD(DS.VDATE,1,NULL) OVER
        (
          ORDER BY DS.VDATE ASC
        )         AS TO_DATE
      FROM  DATE_SET  DS
    )
    ,FILTERED_PERIODS AS
    (
      SELECT
       PS.TOFROM
      ,PS.FROM_DATE
      ,PS.TO_DATE
      FROM   PERIOD_SET     PS
      WHERE PS.TO_DATE IS NOT NULL
    )
    ,DATE_RANGE AS
    (
      SELECT
       MIN(FP.FROM_DATE) AS MIN_DATE
       ,MAX(FP.TO_DATE) AS MAX_DATE
       ,DATEDIFF(MONTH,MIN(FP.FROM_DATE),MAX(FP.TO_DATE)) AS NUM_YEAR
      FROM  FILTERED_PERIODS  FP
    )
    ,T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))
    , NUMS(N) AS (SELECT TOP((SELECT DR.NUM_YEAR FROM DATE_RANGE DR)) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N
         FROM DATE_RANGE DR,T T1,T T2,T T3)
    SELECT DISTINCT
    PS.Stack
    ,PS.ProductCode
    ,FP.FROM_DATE
    ,FP.TO_DATE
    ,PS.Discount1      
    FROM    DATE_RANGE DR
    CROSS APPLY  NUMS  NM
    CROSS APPLY  dbo.ProductDiscounts  PS
    CROSS APPLY  FILTERED_PERIODS   FP
    WHERE DATEADD(MONTH, NM.N, DR.MIN_DATE) > FP.FROM_DATE
    AND  DATEADD(MONTH, NM.N, DR.MIN_DATE) < FP.TO_DATE
    AND  DATEADD(MONTH, NM.N, DR.MIN_DATE) > PS.ValidFrom
    AND  DATEADD(MONTH, NM.N, DR.MIN_DATE) < PS.ValidTo
    ORDER BY FP.FROM_DATE ASC
    ;

    Output

    Stack ProductCode FROM_DATE  TO_DATE    Discount1
    ----- ----------- ---------- ---------- ------------------------
    A     PROD40      2000-01-01 2009-02-01 0.100
    A     PROD40      2009-02-01 2011-01-31 0.100
    B     PROD40      2009-02-01 2011-01-31 0.200
    D     PROD40      2009-02-01 2011-01-31 0.300
    A     PROD40      2011-02-01 2040-12-31 0.100
    C     PROD40      2011-02-01 2040-12-31 0.150
    D     PROD40      2011-02-01 2040-12-31 0.300
    C     PROD40      2040-12-31 2041-12-31 0.150

    Thank you very much for your help 🙂 However, some of the dates are still overlapping 🙁

  • Vegeta7 - Sunday, November 5, 2017 8:17 AM

    Eirikur Eiriksson - Sunday, November 5, 2017 2:02 AM

    This problem is straight forward if you realize what is missing in the set which is the date range property.
    😎

    Here is a quick solution which generates an "inline calendar" to create the ranges

    USE [TEEST]
    GO

    SET NOCOUNT ON;

    IF OBJECT_ID('dbo.ProductDiscounts') IS NOT NULL DROP TABLE dbo.ProductDiscounts;

    CREATE TABLE dbo.ProductDiscounts
    (
      Stack   VARCHAR (5)
     ,ProductCode VARCHAR (10)
     ,ValidFrom DATE
     ,ValidTo  DATE
     ,Discount1 DECIMAL(4,3)
    )
    ;

    INSERT INTO dbo.ProductDiscounts (Stack, ProductCode, ValidFrom, ValidTo, Discount1)
    VALUES
    ('A','PROD40','20000101','20401231',0.1),
    ('B','PROD40','20090201','20110131',0.2),
    ('C','PROD40','20110201','20411231',0.15),
    ('D','PROD40','20090201','20401231',0.3)

    ;WITH DATE_SET AS
    (
      SELECT DISTINCT
       PRD.TOFROM
      ,PRD.VDATE
      FROM  dbo.ProductDiscounts  PD
      CROSS APPLY
       (
        SELECT 1,PD.ValidFrom UNION ALL
        SELECT 0,PD.ValidTo
       ) PRD(TOFROM,VDATE)
    )
    ,PERIOD_SET AS
    (
      SELECT
       DS.TOFROM
       ,DS.VDATE        AS FROM_DATE      
       ,LEAD(DS.VDATE,1,NULL) OVER
        (
          ORDER BY DS.VDATE ASC
        )         AS TO_DATE
      FROM  DATE_SET  DS
    )
    ,FILTERED_PERIODS AS
    (
      SELECT
       PS.TOFROM
      ,PS.FROM_DATE
      ,PS.TO_DATE
      FROM   PERIOD_SET     PS
      WHERE PS.TO_DATE IS NOT NULL
    )
    ,DATE_RANGE AS
    (
      SELECT
       MIN(FP.FROM_DATE) AS MIN_DATE
       ,MAX(FP.TO_DATE) AS MAX_DATE
       ,DATEDIFF(MONTH,MIN(FP.FROM_DATE),MAX(FP.TO_DATE)) AS NUM_YEAR
      FROM  FILTERED_PERIODS  FP
    )
    ,T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))
    , NUMS(N) AS (SELECT TOP((SELECT DR.NUM_YEAR FROM DATE_RANGE DR)) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N
         FROM DATE_RANGE DR,T T1,T T2,T T3)
    SELECT DISTINCT
    PS.Stack
    ,PS.ProductCode
    ,FP.FROM_DATE
    ,FP.TO_DATE
    ,PS.Discount1      
    FROM    DATE_RANGE DR
    CROSS APPLY  NUMS  NM
    CROSS APPLY  dbo.ProductDiscounts  PS
    CROSS APPLY  FILTERED_PERIODS   FP
    WHERE DATEADD(MONTH, NM.N, DR.MIN_DATE) > FP.FROM_DATE
    AND  DATEADD(MONTH, NM.N, DR.MIN_DATE) < FP.TO_DATE
    AND  DATEADD(MONTH, NM.N, DR.MIN_DATE) > PS.ValidFrom
    AND  DATEADD(MONTH, NM.N, DR.MIN_DATE) < PS.ValidTo
    ORDER BY FP.FROM_DATE ASC
    ;

    Output

    Stack ProductCode FROM_DATE  TO_DATE    Discount1
    ----- ----------- ---------- ---------- ------------------------
    A     PROD40      2000-01-01 2009-02-01 0.100
    A     PROD40      2009-02-01 2011-01-31 0.100
    B     PROD40      2009-02-01 2011-01-31 0.200
    D     PROD40      2009-02-01 2011-01-31 0.300
    A     PROD40      2011-02-01 2040-12-31 0.100
    C     PROD40      2011-02-01 2040-12-31 0.150
    D     PROD40      2011-02-01 2040-12-31 0.300
    C     PROD40      2040-12-31 2041-12-31 0.150

    Thank you very much for your help 🙂 However, some of the dates are still overlapping 🙁
    

    I have to ask... Once corrected, what will this result be used for/to do?  The reason why I ask is because, if you're going to do what I think your going to do, then there's a much easier way to do it.  Let us know.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Sunday, November 5, 2017 5:52 PM

    I have to ask... Once corrected, what will this result be used for/to do?  The reason why I ask is because, if you're going to do what I think your going to do, then there's a much easier way to do it.  Let us know.

    --Jeff Moden

    Hi Jeff,

    Thanks for taking the time to look into this 🙂 This is the second transformation I need to do. The two next transformations I already know how to achieve them, but so that you can better understand my objective, the end result will be to have a discount fact table, where I will have all the discounts correctly segmented by period. So, I will have something like this:

    In short, I will know all the stacks and respective discounts, ListPrice and FinalPrice for any given period of time. So if a sale were to occur on the 2009-02-01, I would know that I would have to apply those 3 discounts on the ListPrice to get that FinalPrice.

    I'm pretty confident that I will be able to do the two next steps: add the discounts using a pivot I learned from you Jeff (😀) and add the ListPrice also thanks to the help of this forum's experts 🙂 I only need to achieve the result on the date ranges that I'm asking on this thread to make everything else possible 🙂 But as simple as it seems, I can't seem to get there without forcing the result, which will not work on the real data. I need a clean and correct way to make everything dynamic :pinch:

  • Here's an alternative method. The results match your expected output except for the last row which I think has incorrect startdate:
    SELECT pd.Stack, pd.ProductCode, pd.Discount1,
     x.ValidFrom, x.ValidTo
    FROM #ProductDiscounts pd
    CROSS APPLY (
     SELECT ValidFrom = MIN(Date), ValidTo = MAX(Date)
     FROM (
      SELECT
       x.Date, n = (1+ROW_NUMBER() OVER(ORDER BY x.Date))/2 
      FROM #ProductDiscounts i
      CROSS APPLY (
       SELECT d.Date
       FROM (VALUES
        (DATEADD(DAY,-1,i.ValidFrom)),
        (i.ValidFrom),
        (ValidTo),
        (DATEADD(DAY,1,ValidTo))
        ) d (Date)
       WHERE d.Date BETWEEN pd.ValidFrom AND pd.ValidTo
      ) x
      WHERE i.ProductCode = pd.ProductCode
       AND i.ValidFrom <= pd.ValidTo
       AND i.ValidTo >= pd.ValidFrom
      GROUP BY x.Date
     ) d
     GROUP BY n
    ) x
    ORDER BY pd.ProductCode, x.ValidFrom, pd.Stack

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work - Monday, November 6, 2017 6:18 AM

    Here's an alternative method. The results match your expected output except for the last row which I think has incorrect startdate:
    SELECT pd.Stack, pd.ProductCode, pd.Discount1,
     x.ValidFrom, x.ValidTo
    FROM #ProductDiscounts pd
    CROSS APPLY (
     SELECT ValidFrom = MIN(Date), ValidTo = MAX(Date)
     FROM (
      SELECT
       x.Date, n = (1+ROW_NUMBER() OVER(ORDER BY x.Date))/2 
      FROM #ProductDiscounts i
      CROSS APPLY (
       SELECT d.Date
       FROM (VALUES
        (DATEADD(DAY,-1,i.ValidFrom)),
        (i.ValidFrom),
        (ValidTo),
        (DATEADD(DAY,1,ValidTo))
        ) d (Date)
       WHERE d.Date BETWEEN pd.ValidFrom AND pd.ValidTo
      ) x
      WHERE i.ProductCode = pd.ProductCode
       AND i.ValidFrom <= pd.ValidTo
       AND i.ValidTo >= pd.ValidFrom
      GROUP BY x.Date
     ) d
     GROUP BY n
    ) x
    ORDER BY pd.ProductCode, x.ValidFrom, pd.Stack

    Actually, your query matches 100% my expected output! 😀 😀 😀

    My expected output:

    Your output:

    I'm so happy! I will carefully study your query tonight when I get home and adapt it to the real database to test your logic :')

  • Here is an approach that requires one fewer sorts than Chris M.'s method.

    It uses a fairly standard technique for packing intervals developed by Itzik Ben Gan.

    Your data uses a closed interval notation, but this method works best with a half-closed interval notation, so part of the query is used to convert from closed to half-closed and back (the two DATEADD() functions).

    ;
    WITH ValidDates AS
    (
        SELECT ValidDate AS ValidFrom, DATEADD(DAY, -1, LEAD(ValidDate, 1, '2040-12-31') OVER(PARTITION BY ProductCode ORDER BY ValidDate, Stack)) AS ValidTo
        FROM #ProductDiscounts
        CROSS APPLY ( VALUES(ValidFrom), (DATEADD(DAY, 1, ValidTo)) ) dt(ValidDate)
    )
    SELECT pd.Stack, pd.ProductCode, pd.Discount1, vd.ValidFrom, vd.ValidTo
    FROM #ProductDiscounts pd
    INNER JOIN ValidDates vd
        ON pd.ValidFrom < vd.ValidTo
            AND vd.ValidFrom < pd.ValidTo
    WHERE vd.ValidFrom < vd.ValidTo  /*  Exclude invalid date ranges due to "duplicate" dates.  */
    ORDER BY ProductCode, vd.ValidFrom, Stack;

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • What about something like this to avoid creating more rows?

    CREATE TABLE ProductDiscounts (
    Stack VARCHAR (255),
    ProductCode VARCHAR (255),
    ValidFrom DATE,
    ValidTo DATE,
    Discount1 decimal(4,3));

    INSERT INTO ProductDiscounts
    (Stack, ProductCode, ValidFrom, ValidTo, Discount1)
    VALUES
    ('A','PROD40','20000101','20401231',0.1),
    ('B','PROD40','20090201','20110131',0.2),
    ('C','PROD40','20110201','20411231',0.15),
    ('D','PROD40','20090201','20401231',0.3);

    CREATE TABLE Products (
    ProductCode VARCHAR (255),
    ListPrice decimal(10,4));

    INSERT INTO Products VALUES('PROD40', 200);

    WITH cteProductDiscounts AS(
      SELECT *, ROW_NUMBER() OVER(PARTITION BY ProductCode ORDER BY ValidFrom) rn
      FROM ProductDiscounts pd
      WHERE '20090201' BETWEEN ValidFrom AND ValidTo
    ),
    ctePivotedDiscounts AS(
      SELECT MAX( CASE WHEN pd.rn = 1 THEN pd.Stack END) AS Stack1
        ,MAX( CASE WHEN pd.rn = 2 THEN pd.Stack END) AS Stack2
        ,MAX( CASE WHEN pd.rn = 3 THEN pd.Stack END) AS Stack3
        ,p.ProductCode
        ,p.ListPrice
        ,MAX( CASE WHEN pd.rn = 1 THEN pd.Discount1 END) AS Discount1
        ,MAX( CASE WHEN pd.rn = 2 THEN pd.Discount1 END) AS Discount2
        ,MAX( CASE WHEN pd.rn = 3 THEN pd.Discount1 END) AS Discount3
        ,MAX( ValidFrom) AS ValidFrom
        ,MIN( ValidTo) AS ValidTo
      FROM cteProductDiscounts pd
      JOIN Products p ON p.ProductCode = pd.ProductCode
      GROUP BY p.ProductCode
        ,p.ListPrice
    )
    SELECT Stack1
       ,Stack2
       ,Stack3
       ,ProductCode
       ,ListPrice
       ,Discount1
       ,Discount2
       ,Discount3
       ,ListPrice * (1-Discount1) * (1-Discount2) * (1-Discount3) AS FinalPrice --Line edited after Steve's remark.
       ,ValidFrom
       ,ValidTo
    FROM ctePivotedDiscounts;
    GO
    DROP TABLE ProductDiscounts, Products

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares - Monday, November 6, 2017 1:24 PM

    What about something like this to avoid creating more rows?

    CREATE TABLE ProductDiscounts (
    Stack VARCHAR (255),
    ProductCode VARCHAR (255),
    ValidFrom DATE,
    ValidTo DATE,
    Discount1 decimal(4,3));

    INSERT INTO ProductDiscounts
    (Stack, ProductCode, ValidFrom, ValidTo, Discount1)
    VALUES
    ('A','PROD40','20000101','20401231',0.1),
    ('B','PROD40','20090201','20110131',0.2),
    ('C','PROD40','20110201','20411231',0.15),
    ('D','PROD40','20090201','20401231',0.3);

    CREATE TABLE Products (
    ProductCode VARCHAR (255),
    ListPrice decimal(10,4));

    INSERT INTO Products VALUES('PROD40', 200);

    WITH cteProductDiscounts AS(
      SELECT *, ROW_NUMBER() OVER(PARTITION BY ProductCode ORDER BY ValidFrom) rn
      FROM ProductDiscounts pd
      WHERE '20090201' BETWEEN ValidFrom AND ValidTo
    ),
    ctePivotedDiscounts AS(
      SELECT MAX( CASE WHEN pd.rn = 1 THEN pd.Stack END) AS Stack1
        ,MAX( CASE WHEN pd.rn = 2 THEN pd.Stack END) AS Stack2
        ,MAX( CASE WHEN pd.rn = 3 THEN pd.Stack END) AS Stack3
        ,p.ProductCode
        ,p.ListPrice
        ,MAX( CASE WHEN pd.rn = 1 THEN pd.Discount1 END) AS Discount1
        ,MAX( CASE WHEN pd.rn = 2 THEN pd.Discount1 END) AS Discount2
        ,MAX( CASE WHEN pd.rn = 3 THEN pd.Discount1 END) AS Discount3
        ,MAX( ValidFrom) AS ValidFrom
        ,MIN( ValidTo) AS ValidTo
      FROM cteProductDiscounts pd
      JOIN Products p ON p.ProductCode = pd.ProductCode
      GROUP BY p.ProductCode
        ,p.ListPrice
    )
    SELECT Stack1
       ,Stack2
       ,Stack3
       ,ProductCode
       ,ListPrice
       ,Discount1
       ,Discount2
       ,Discount3
       ,ListPrice * (-1+Discount1) * (-1+Discount2) * (-1+Discount3) AS FinalPrice
       ,ValidFrom
       ,ValidTo
    FROM ctePivotedDiscounts;
    GO
    DROP TABLE ProductDiscounts, Products

    Don't you want (1 - Discount1) instead of (-1 + Discount1) ?   This would be true of the other two discounts as well....

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Monday, November 6, 2017 2:44 PM

    Don't you want (1 - Discount1) instead of (-1 + Discount1) ?   This would be true of the other two discounts as well....

    You're right. Silly me got, I'm not sure what I was thinking.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Vegeta7 - Saturday, November 4, 2017 12:30 PM

    >> I have this ProductDiscounts table: <<

    Apparently you don't know that a table, by definition --Let me repeat that; by definition -- must have a key. Since everything in your non-table can be null, there's no way can ever have a key. I would've kicked you out of the first day of class

    What you posted is basically an old style COBOL punch card file declaration written in SQL. Did you know the products have standardized codes, such as UPC, GTIN, et al? Tables always have keys. By definition, not by option. You also need to put in temporal constraints. These constraints will really, really help the optimizer.

    CREATE TABLE ProductDiscounts
    (stack_name CHAR(5) NOT NULL PRIMARY KEY,
    product_gtin CHAR(15) NOT NULL,
    discount_start_date DATE NOT NULL,
    discount_end_date DATE,
    CHECK(discount_start_date <= discount_end_date) --important!
    foobar_discount_rate DECIMAL(4,3) NOT NULL
    CHECK(foobar_discount_rate > 0.000 )
    );

    INSERT INTO ProductDiscounts
    VALUES
    ('A','PROD40','2000-01-01','2040-1231',0.1),
    ('B','PROD40','2009-02-01','2011-01-31',0.2),
    ('C','PROD40','2011-02-01','2041-12-31',0.15),
    ('D','PROD40','2009-02-01','2040-12-31',0.3);

    >> Here's a look at the timeframe:
    52% of original size(was 974x128) - Click to enlarge <<

    Frankly, I do not feel like printing out and then transcribing the data that you are too lazy to put into SQL. You're asking people to help you for free instead of for a few hundred bucks an hour, and yet you do not have the ability or the willingness to simply transcribe your own data. Why should we have any respect for you?

    You will need a calendar table to locate a date within an interval in which the discount applies. This becomes a simple join.

    Also, also the habit of putting a, in front of the line of of code goes back to the days when wrote code on punch cards. We wrote the code this way (usually Fortran but sometimes COBOL) saluted rearrange the deck of punch cards. You're not thinking in sets or relationally at all.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • Vegeta7 - Sunday, November 5, 2017 6:52 PM

    ... the end result will be to have a discount fact table, where I will have all the discounts correctly segmented by period.

    In short, I will know all the stacks and respective discounts, ListPrice and FinalPrice for any given period of time. So if a sale were to occur on the 2009-02-01, I would know that I would have to apply those 3 discounts on the ListPrice to get that FinalPrice.

    That's pretty much what I thought.  I'm thinking there's an easier method to do this, which may be faster and won't require splitting discount periods based on starts and ends of other discount periods.  I'm a bit pressed for time but I'll see if I can put something together for you in the next day.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jcelko212 32090 - Monday, November 6, 2017 4:16 PM

    Vegeta7 - Saturday, November 4, 2017 12:30 PM

    >> I have this ProductDiscounts table: <<

    Apparently you don't know that a table, by definition --Let me repeat that; by definition -- must have a key. Since everything in your non-table can be null, there's no way can ever have a key. I would've kicked you out of the first day of class

    What you posted is basically an old style COBOL punch card file declaration written in SQL. Did you know the products have standardized codes, such as UPC, GTIN, et al? Tables always have keys. By definition, not by option. You also need to put in temporal constraints. These constraints will really, really help the optimizer.

    CREATE TABLE ProductDiscounts
    (stack_name CHAR(5) NOT NULL PRIMARY KEY,
    product_gtin CHAR(15) NOT NULL,
    discount_start_date DATE NOT NULL,
    discount_end_date DATE,
    CHECK(discount_start_date <= discount_end_date) --important!
    foobar_discount_rate DECIMAL(4,3) NOT NULL
    CHECK(foobar_discount_rate > 0.000 )
    );

    INSERT INTO ProductDiscounts
    VALUES
    ('A','PROD40','2000-01-01','2040-1231',0.1),
    ('B','PROD40','2009-02-01','2011-01-31',0.2),
    ('C','PROD40','2011-02-01','2041-12-31',0.15),
    ('D','PROD40','2009-02-01','2040-12-31',0.3);

    >> Here's a look at the timeframe:
    52% of original size(was 974x128) - Click to enlarge <<

    Frankly, I do not feel like printing out and then transcribing the data that you are too lazy to put into SQL. You're asking people to help you for free instead of for a few hundred bucks an hour, and yet you do not have the ability or the willingness to simply transcribe your own data. Why should we have any respect for you?

    You will need a calendar table to locate a date within an interval in which the discount applies. This becomes a simple join.

    Also, also the habit of putting a, in front of the line of of code goes back to the days when wrote code on punch cards. We wrote the code this way (usually Fortran but sometimes COBOL) saluted rearrange the deck of punch cards. You're not thinking in sets or relationally at all.

    CODE JOE!  Post the code for YOUR Calendar table and the solution for this problem!

    And the good thing about punched cards is that they kept people from making ridiculously wide entities... like Calendar tables.. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Monday, November 6, 2017 5:17 PM

    Vegeta7 - Sunday, November 5, 2017 6:52 PM

    ... the end result will be to have a discount fact table, where I will have all the discounts correctly segmented by period.

    In short, I will know all the stacks and respective discounts, ListPrice and FinalPrice for any given period of time. So if a sale were to occur on the 2009-02-01, I would know that I would have to apply those 3 discounts on the ListPrice to get that FinalPrice.

    That's pretty much what I thought.  I'm thinking there's an easier method to do this, which may be faster and won't require splitting discount periods based on starts and ends of other discount periods.  I'm a bit pressed for time but I'll see if I can put something together for you in the next day.

    Knowing the possible end uses for this problem, you don't need a Calendar table and you don't need to slice and dice the discount table.  You just need to use it.  First, you need to add the following index to it.


     CREATE CLUSTERED INDEX IXC_By_ValidFrom_ValidTo ON dbo.ProductDiscounts
            (ValidFrom, ValidTo)
    ;

    Let's test things out.  I don't know exactly what you want for output for your followup tasks so I'm just showing a possibility.  To test things out and test performance in the process, here's a test data generator that will create a million orders with a date for each order.  I'm also assuming that your discount table will have more than one product in it but made no additional products for this POP code.

    Here's the test data generation code.


    --===== Create and populate a test table
    -- DROP TABLE OrderDetail;
       WITH cteGenDates AS
    (
     SELECT TOP 1000000
            OrderDate = CONVERT(DATE,DATEADD(dd,ABS(CHECKSUM(NEWID())%DATEDIFF(dd,'1999','2013')),'1999'))
       FROM      sys.all_columns ac1
      CROSS JOIN sys.all_columns ac2
    )
     SELECT  OrderID = ROW_NUMBER() OVER (ORDER BY OrderDate)
            ,OrderDate
            ,ProductCode = 'Prod40'
       INTO dbo.OrderDetail
       FROM cteGenDates
      ORDER BY OrderDate
    ;

    Here's the code that does the matching based on ProductCode and the dates, as you indicated.  It returns more than one row per order/product if there's more that one discount for that product based on the order date and the discount from and to dates.  It takes about 16 seconds on my machine.  No optimization was attempted because this is just POP code.  The method could certainly be translated to handle singleton lookups.


     SELECT  OrderID        = od.OrderID
            ,OrderDate      = od.OrderDate
            ,ProductCode    = od.ProductCode
            ,Stack          = ISNULL(pd.Stack,'No Discount')
            ,ValidFrom      = ISNULL(pd.ValidFrom,'9999')
            ,ValidTo        = ISNULL(pd.ValidTo,'9999')
            ,Discount       = ISNULL(pd.Discount1,0)
       FROM      dbo.OrderDetail      od
       LEFT JOIN dbo.ProductDiscounts pd
              ON od.OrderDate  >= pd.ValidFrom
             AND od.OrderDate  <= pd.ValidTo
             AND od.ProductCode = pd.ProductCode
      ORDER BY OrderID
    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 29 total)

You must be logged in to reply to this topic. Login to reply