T-SQL query with loop ?

  • Hello,

    I would really appreciate your help or some direction. i am familiar with all joins, Case and usual sql formats, but the below requirement has me stumped.

    Hopefully i can explain the need well: (Tables and data given below)

    Rules:

    1. When (TodaysDate >= A.BusinessDate - B.NumberofDays) then use B.Value column

    Additional rules:

    a. if more than one row satisifes then use the least NumberofDays

    b. If there are no rows which satisfies rule 1 then use Value where NumberofDays = NULL

    Example:

    Say TodaysDate = May 13 (using a getdate())

    For A.SeasonID = 100, A.RouteID = 10, A.BusinessDate = 2014-05-17 there are 3 matching rows in table B

    Applying the rules now -

    For B.ReleaseID =1 : Since May 13th > ( May 17-7) we need B.Value = 30

    For B.ReleaseID =2 : Even though May 13 > (May 17) -28 we ignore it, since (May 17) -7 is a closer value.

    For B.ReleaseID =3 : If both above conditions dont match we need the Value 20 corresponding to Numberofdays = NULL

    CREATE TABLE #A

    (

    MainId INT,SeasonId INT,RouteId INT,BusinessDate DATE

    )

    CREATE TABLE #B

    (

    ReleaseId INT,SeasonId INT,RouteId INT

    ,ReleaseNumber INT,NumberOfDays INT NULL,

    Value INT

    )

    CREATE TABLE #C

    (

    BusinessDate DATE,SeasonId INT,RouteId INT

    ,ReleaseNumber INT,Value INT

    )

    INSERT INTO #A

    SELECT 1,100,10,'2014/05/17' UNION

    SELECT 2,100,9,'2014/05/17' UNION

    SELECT 4,100,10,'2014/05/22' UNION

    SELECT 5,100,9,'2014/05/22'

    INSERT INTO #B

    SELECT 1,100,10,1,NULL,20 UNION

    SELECT 2,100,10,2,7,30 UNION

    SELECT 3,100,10,3,14,40 UNION

    SELECT 4,100,9,1,NULL,50

    INSERT INTO #C

    SELECT '5/17/2014',100,9,1,50 UNION

    SELECT '5/17/2014',100,10,2,30 UNION

    SELECT '5/22/2014',100,9,1,50 UNION

    SELECT '5/22/2014',100,10,3,40

    SELECT * FROM #B ORDER BY 3

    SELECT * FROM #A ORDER BY 3

    SELECT * FROM #C

    Final Output needed is given in #C. I manually entered the output values needed for reference.

    Regards

    Mathew

  • /* You missed out the ddl for table #C, the ReleaseNumber on the

    last row of your sample data is probably incorrect, and your spec is sloppy

    and vague. That's why your post has been up for five hours with no responses.

    This query comes close enough as a trial run. Note that it's NOT optimised,

    it's laid out in a way to facilitate development and understanding.

    Check the query against a more substantial sample set and post back inconsistencies.

    If it's correct, there are a number of changes to make before

    letting it anywhere near a production environment. */

    SELECT

    a.BusinessDate,

    a.SeasonID,

    a.RouteID,

    b.ReleaseNumber,

    b.Value

    FROM #A a

    CROSS APPLY (

    SELECT b.ReleaseID, b.ReleaseNumber, b.NumberOfDays, b.Value,

    x.TodaysDate, x.NewDate,

    y.DaysDiff,

    rn = ROW_NUMBER() OVER(ORDER BY CASE

    WHEN y.DaysDiff < 0 THEN 9999

    WHEN y.DaysDiff IS NULL THEN 9999

    ELSE y.DaysDiff END)

    FROM #B b

    CROSS APPLY (

    SELECT

    TodaysDate = CAST(GETDATE()-1 AS DATE),

    NewDate = DATEADD(day, 0-B.NumberofDays, A.BusinessDate)

    ) x

    CROSS APPLY (

    SELECT DaysDiff = DATEDIFF(day, x.NewDate, x.TodaysDate)

    ) y

    WHERE b.SeasonID = a.SeasonID

    AND b.RouteID = a.RouteID

    ) b

    WHERE rn = 1

    ORDER BY a.BusinessDate, a.SeasonID, a.RouteID

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks much Chris. You are awesome! πŸ™‚ Thanks for working through my unclear specs and data. My apologies.

    Your solution works well on larger sample size.

    Could you mention the changes needed to the current code.

  • MathewK (5/14/2014)


    Thanks much Chris. You are awesome! πŸ™‚ Thanks for working through my unclear specs and data. My apologies.

    Your solution works well on larger sample size.

    Could you mention the changes needed to the current code.

    Thanks, that's very kind.

    Firstly, rip out unnecessary columns from the subquery for table B. Less data to carry through, more chance of an index matching the query. I've made a couple of small changes to your sample set to force usage of an index.

    Secondly, cut down on the CROSS APPLY cascades, they can get expensive very quickly.

    Thirdly, try the TOP equivalent to ROW_NUMBER used in the original query.

    DROP TABLE #A

    CREATE TABLE #A (MainId INT NOT NULL PRIMARY KEY,SeasonId INT,RouteId INT,BusinessDate DATE)

    INSERT INTO #A

    SELECT 1,100,10,'2014/05/17' UNION

    SELECT 2,100,9,'2014/05/17' UNION

    SELECT 4,100,10,'2014/05/22' UNION

    SELECT 5,100,9,'2014/05/22'

    DROP TABLE #B

    CREATE TABLE #B (ReleaseId INT NOT NULL PRIMARY KEY,SeasonId INT,RouteId INT,ReleaseNumber INT,NumberOfDays INT NULL,Value INT, Dummy VARCHAR(MAX))

    CREATE INDEX ix_Helper ON #B (SeasonId, RouteId) INCLUDE (ReleaseNumber, NumberofDays, Value)

    INSERT INTO #B

    SELECT 1,100,10,1,NULL,20, REPLICATE('A',8000) UNION

    SELECT 2,100,10,2,7,30, 'A' UNION

    SELECT 3,100,10,3,14,40, 'A' UNION

    SELECT 4,100,9,1,NULL,50, 'A'

    --============================================================================

    SELECT

    a.BusinessDate,

    a.SeasonID,

    a.RouteID,

    b.ReleaseNumber,

    b.Value

    FROM #A a

    CROSS APPLY ( -- b

    SELECT

    b.ReleaseNumber, b.Value, -- just the columns you need

    rn = ROW_NUMBER() OVER(ORDER BY CASE

    WHEN y.DaysDiff < 0 THEN 9999

    WHEN y.DaysDiff IS NULL THEN 9999

    ELSE y.DaysDiff END)

    FROM #B b

    CROSS APPLY ( -- cascaded CROSS APPLY can get expensive

    SELECT DaysDiff = DATEDIFF(day,

    DATEADD(day, 0-b.NumberofDays, a.BusinessDate),

    CAST(GETDATE()-1 AS DATE))

    ) y

    WHERE b.SeasonID = a.SeasonID

    AND b.RouteID = a.RouteID

    ) b

    WHERE rn = 1

    ORDER BY a.BusinessDate, a.SeasonID, a.RouteID

    -- ==============================================================

    SELECT

    a.BusinessDate,

    a.SeasonID,

    a.RouteID,

    b.ReleaseNumber,

    b.Value

    FROM #A a

    CROSS APPLY ( -- b

    SELECT TOP 1

    b.ReleaseNumber, b.Value -- just the columns you need

    FROM #B b

    CROSS APPLY ( -- cascaded CROSS APPLY can get expensive

    SELECT DaysDiff = DATEDIFF(day,

    DATEADD(day, 0-b.NumberofDays, a.BusinessDate),

    CAST(GETDATE()-1 AS DATE))

    ) y

    WHERE b.SeasonID = a.SeasonID

    AND b.RouteID = a.RouteID

    ORDER BY CASE

    WHEN y.DaysDiff < 0 THEN 9999

    WHEN y.DaysDiff IS NULL THEN 9999

    ELSE y.DaysDiff END

    ) b

    ORDER BY a.BusinessDate, a.SeasonID, a.RouteID

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Dear Chris - thanks much for your kind help and your detailed answer!! This helps a lot. πŸ™‚

    Regards

    Mathew

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply