Gaps and Islands with previous possible values

  • Hi there,

    create table TestData (ID int,TestDate date, Score int)

    insert into testdata Values ('1','1-Jan-16','10')

    insert into testdata Values ('1','5-Jan-16','15')

    insert into testdata Values ('1','10-Jan-16','5')

    Create table Calendartable (dt date)

    insert into Calendartable values ('2016/01/01')

    insert into Calendartable values ('2016/01/02')

    insert into Calendartable values ('2016/01/03')

    insert into Calendartable values ('2016/01/04')

    insert into Calendartable values ('2016/01/05')

    insert into Calendartable values ('2016/01/06')

    insert into Calendartable values ('2016/01/07')

    insert into Calendartable values ('2016/01/08')

    insert into Calendartable values ('2016/01/09')

    insert into Calendartable values ('2016/01/10')

    used calendar table to fill gaps in testdata , by doing left join to calendar table.

    select a.*,b.dt as dates from TestData a right join Calendartable b

    on a.testdate = b.dt

    Now testdata columns will be filled with null.

    Instead of null I require to fill with the previous value of the testdata .

    (i.e., )

    * by the above example we have gap between 1-Jan-16 and 5-Jan'16

    * 3rd and 4th date is missing

    * testdata columns will be null in right join. But instead I require to fill 1-Jan-16 data for 2nd, 3rd and 4th date.

    * vise versa for all the gaps

    expected result would be,

  • SELECT

    a.ID, a.TestDate, a.Score,

    b.dt AS dates

    FROM Calendartable b

    OUTER APPLY (

    SELECT TOP 1 a.ID, a.TestDate, a.Score

    FROM testdata a

    WHERE a.TestDate <= b.dt

    ORDER BY a.TestDate DESC

    ) a

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (6/22/2016)


    SELECT

    a.ID, a.TestDate, a.Score,

    b.dt AS dates

    FROM Calendartable b

    OUTER APPLY (

    SELECT TOP 1 a.ID, a.TestDate, a.Score

    FROM testdata a

    WHERE a.TestDate <= b.dt

    ORDER BY a.TestDate DESC

    ) a

    Test data is huge data set on real time. so sorting & outer apply would cause severe damage. Is there any other possible way ??

  • squvi.87 (6/22/2016)


    ChrisM@Work (6/22/2016)


    SELECT

    a.ID, a.TestDate, a.Score,

    b.dt AS dates

    FROM Calendartable b

    OUTER APPLY (

    SELECT TOP 1 a.ID, a.TestDate, a.Score

    FROM testdata a

    WHERE a.TestDate <= b.dt

    ORDER BY a.TestDate DESC

    ) a

    Test data is huge data set on real time. so sorting & outer apply would cause severe damage. Is there any other possible way ??

    "severe damage"?

    How many rows are you expecting the query to return, and what consumes those rows?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • You could use the Quirky Update. Be sure to read and understand the following article as you might get this wrong if you don't know what you're doing. http://www.sqlservercentral.com/articles/T-SQL/68467/

    After that, the code is not that complex.

    CREATE TABLE #Result(

    ID int,

    TestDate date,

    Score int,

    dates date

    );

    CREATE CLUSTERED INDEX CUResult ON #Result(dates);

    INSERT INTO #Result WITH (TABLOCKX)

    select a.*,

    b.dt as dates

    from TestData a

    right join Calendartable b on a.testdate = b.dt;

    DECLARE @ID int,

    @TestDate date,

    @Score int,

    @dates date

    UPDATE #Result WITH(TABLOCKX) SET

    @ID = ID = CASE WHEN ID IS NULL THEN @ID ELSE ID END,

    @TestDate = TestDate = CASE WHEN TestDate IS NULL THEN @TestDate ELSE TestDate END,

    @Score = Score = CASE WHEN Score IS NULL THEN @Score ELSE Score END,

    @dates = dates

    OPTION (MAXDOP 1);

    SELECT *

    FROM #Result

    ORDER BY dates;

    DROP TABLE #Result;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • "severe damage"?

    How many rows are you expecting the query to return, and what consumes those rows?

    250 MILLION

  • Luis Cazares (6/22/2016)


    You could use the Quirky Update. Be sure to read and understand the following article as you might get this wrong if you don't know what you're doing. http://www.sqlservercentral.com/articles/T-SQL/68467/

    UPDATE #Result WITH(TABLOCKX) SET

    @ID = ID = CASE WHEN ID IS NULL THEN @ID ELSE ID END,

    @TestDate = TestDate = CASE WHEN TestDate IS NULL THEN @TestDate ELSE TestDate END,

    @Score = Score = CASE WHEN Score IS NULL THEN @Score ELSE Score END,

    @dates = dates

    OPTION (MAXDOP 1);

    DROP TABLE #Result;

    [/code]

    THANKS FOR THE CODE. BUT WHY MAXDOP MANUALLY SET TO 1 ? IS THERE SPECIFIC REASON FOR THAT ?

  • squvi.87 (6/23/2016)


    "severe damage"?

    How many rows are you expecting the query to return, and what consumes those rows?

    250 MILLION

    and what consumes those rows? What are you going to do with the results?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • squvi.87 (6/23/2016)


    Luis Cazares (6/22/2016)


    You could use the Quirky Update. Be sure to read and understand the following article as you might get this wrong if you don't know what you're doing. http://www.sqlservercentral.com/articles/T-SQL/68467/

    UPDATE #Result WITH(TABLOCKX) SET

    @ID = ID = CASE WHEN ID IS NULL THEN @ID ELSE ID END,

    @TestDate = TestDate = CASE WHEN TestDate IS NULL THEN @TestDate ELSE TestDate END,

    @Score = Score = CASE WHEN Score IS NULL THEN @Score ELSE Score END,

    @dates = dates

    OPTION (MAXDOP 1);

    DROP TABLE #Result;

    [/code]

    THANKS FOR THE CODE. BUT WHY MAXDOP MANUALLY SET TO 1 ? IS THERE SPECIFIC REASON FOR THAT ?

    YES THERE IS sorry yes there is. Multiple threads may break the process. Read the article.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • squvi.87 (6/23/2016)


    Luis Cazares (6/22/2016)


    You could use the Quirky Update. Be sure to read and understand the following article as you might get this wrong if you don't know what you're doing. http://www.sqlservercentral.com/articles/T-SQL/68467/

    UPDATE #Result WITH(TABLOCKX) SET

    @ID = ID = CASE WHEN ID IS NULL THEN @ID ELSE ID END,

    @TestDate = TestDate = CASE WHEN TestDate IS NULL THEN @TestDate ELSE TestDate END,

    @Score = Score = CASE WHEN Score IS NULL THEN @Score ELSE Score END,

    @dates = dates

    OPTION (MAXDOP 1);

    DROP TABLE #Result;

    [/code]

    THANKS FOR THE CODE. BUT WHY MAXDOP MANUALLY SET TO 1 ? IS THERE SPECIFIC REASON FOR THAT ?

    if you read the full article there is a list of "rules" at the end...... http://www.sqlservercentral.com/articles/T-SQL/68467/

    one of which is

    2. PARALLELISM MUST BE PREVENTED: You MUST prevent parallelism from occurring. Therefore, you MUST include OPTION (MAXDOP 1) in any such code.

    Can you please confrim which version of SQL you are using?....from looking at some of your other posts you have previously indicated 2012/2014.....

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • for clarification please....what results would you expect from this data set

    create table TestData (ID int,TestDate date, Score int)

    insert into testdata Values ('1','1-Jan-16','10')

    insert into testdata Values ('1','5-Jan-16','15')

    insert into testdata Values ('1','10-Jan-16','5')

    -- added

    insert into testdata Values ('2','5-Jan-16','10')

    insert into testdata Values ('2','8-Jan-16','10')

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply