Data from Previous Row_2

  • I would like an efficient way to "fill missing data".

    My preference would be a view.

    I have data like:

    Date, A, B

    1/1/2013 5 99

    1/2/2013 null 99

    1/3/2013 6 null

    1/4/2013 null null

    1/5/2013 7 98

    1/6/2013 null null

    and view results:

    Date, A, B

    1/1/2013 5 99

    1/2/2013 5 99

    1/3/2013 6 99

    1/4/2013 6 99

    1/5/2013 7 98

    1/6/2013 7 98

    etc...you get the idea

    caution: there are about a million rows in the raw table

    Mike O.

  • Do you have continuous dates? Or you might have gaps?

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • might have gaps

  • Since you're in SQL Server 2000, we can't use any CTE for this. That said - you should be able to use a temp table to get the ordering down.

    You will also need a tally table to help with filling in the gaps (see Jeff's article on how to set one up[/url])

    The script would look something like the following:

    declare @maxdt datetime

    select @maxdt=MAX(dateval) from table1

    create table #orderedGaps (seq int identity(1,1), dateval datetime, valone int, val2 int)

    insert into #orderedGaps (dateval, valone,valtwo)

    select date, A, B from table1

    order by date

    select o1.seq ,dateadd(day,rn-1,o1.dateval) ajustedDateVal,o2.dateval, o1.valone, o1.valtwo

    from #orderedgaps o1 left join orderedGaps #o2 on o1.seq+1=o2.seq

    join tally on tally.N<= DATEDIFF(day,o1.dateval,isnull(o2.dateval, @maxdt))

    order by date

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I am using SQL server 2005, 2008 and 2012 so CTE's are OK

    Thanks for your solution, but it looks a little messy as a view definition.

    I was thinking of something using ROWNUMBER.

    I think your solution has problems if there are more than one row with NULL between valid data.

    what is rn? a value from your Tally table?

    as in dateadd(day,rn-1,o1.dateval)

  • Mike Osborne (1/10/2014)


    caution: there are about a million rows in the raw table

    Any chance that you'd like to have the original data in the table updated? If not, any chance of adding two columns to the table that can be update? I ask because it seems a real waste to constantly calculate such "data smears" and they certainly won't be indexable through a view unless it's materialized as a indexed view, which would probably take more space than just adding two columns that can be updated.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Mike Osborne (1/11/2014)


    I am using SQL server 2005, 2008 and 2012 so CTE's are OK

    Thanks for your solution, but it looks a little messy as a view definition.

    I was thinking of something using ROWNUMBER.

    I think your solution has problems if there are more than one row with NULL between valid data.

    what is rn? a value from your Tally table?

    as in dateadd(day,rn-1,o1.dateval)

    Well yes - the solution does get a bit cleaner using rownumber and CTE's: for one thing you don't need the temp table declared (which would be the difference between a view and not).

    As Jeff would describe it, if you're looking to "smear" data of a series of days but want to smear it over nulls or blanks, just exclude them from your initial orderedgaps table. This solution will NOT work if you want to smear ONE column but keep the other (that gets to be a lot more involved).

    and yes - my tally table uses RN as the column name for the incrementing numbers. just missed one of the references (jeff's version uses N as the name).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Can't update the original table...knowing when the data came in is an important piece of data.

    adding a couple of columns and running a process once in a while to update new data is a good idea.

    I guess I was hoping to find a cheap view so I would not "mess up" the source data tables.

    I said the table has about a million records but the typical query would only return a thousand or less.

    Thanks for your help,

    Mike

  • Mike Osborne (1/11/2014)


    Can't update the original table...knowing when the data came in is an important piece of data.

    adding a couple of columns and running a process once in a while to update new data is a good idea.

    I guess I was hoping to find a cheap view so I would not "mess up" the source data tables.

    I said the table has about a million records but the typical query would only return a thousand or less.

    Thanks for your help,

    Mike

    You could probably get away with a MAX where the date is less than date x. You'd need some decent indexing for it to be quick. In 2K12, you could use Lead/Lag functionality for it. It just seems a shame to keep wasting clock cycles on something that will never change.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Mike Osborne (1/10/2014)


    I would like an efficient way to "fill missing data".

    My preference would be a view.

    I have data like:

    Date, A, B

    1/1/2013 5 99

    1/2/2013 null 99

    1/3/2013 6 null

    1/4/2013 null null

    1/5/2013 7 98

    1/6/2013 null null

    and view results:

    Date, A, B

    1/1/2013 5 99

    1/2/2013 5 99

    1/3/2013 6 99

    1/4/2013 6 99

    1/5/2013 7 98

    1/6/2013 7 98

    etc...you get the idea

    caution: there are about a million rows in the raw table

    Mike O.

    You're kind of new here so a couple of hints, first.

    Always post in the correct form. I don't know why you posted this in a 2000 forum and it only led to some confusion.

    If you want coded answers, please see the first link under "Helpful Links" in my signature line below.

    I'll take care of that this time because you're new.

    Here's the kind of code that I was talking about. It actually has to hit the table 3 times. While it can be made to work fairly quickly, it just seems unnecessary since the original data will never change. You could even make a sister table to hold prefilled columns and keep them updated with a well written trigger.

    DROP TABLE #TestTable

    GO

    --===== Create the test table (not a part of the solution)

    CREATE TABLE #TestTable

    (

    Date DATETIME NOT NULL PRIMARY KEY CLUSTERED

    ,A INT

    ,B INT

    )

    ;

    --===== Populate the table with test data (not a part of the solution)

    INSERT INTO #TestTable

    SELECT '1/1/2013',5 ,99 UNION ALL

    SELECT '1/2/2013',null,99 UNION ALL

    SELECT '1/3/2013',6 ,null UNION ALL

    SELECT '1/4/2013',null,null UNION ALL

    SELECT '1/5/2013',7 ,98 UNION ALL

    SELECT '1/6/2013',null,null

    ;

    --===== Solve the problem using conventional means

    SELECT tt.Date

    ,a.A

    ,b.B

    FROM #TestTable tt

    CROSS APPLY (SELECT TOP 1 tta.A FROM #TestTable tta WHERE tta.Date <= tt.Date AND tta.A IS NOT NULL ORDER BY tta.Date DESC)a(A)

    CROSS APPLY (SELECT TOP 1 ttb.B FROM #TestTable ttb WHERE ttb.Date <= tt.Date AND ttb.B IS NOT NULL ORDER BY ttb.Date DESC)b(B)

    ORDER BY tt.Date

    ;

    Result set for the currently provided data and problem definition...

    Date A B

    ----------------------- ----------- -----------

    2013-01-01 00:00:00.000 5 99

    2013-01-02 00:00:00.000 5 99

    2013-01-03 00:00:00.000 6 99

    2013-01-04 00:00:00.000 6 99

    2013-01-05 00:00:00.000 7 98

    2013-01-06 00:00:00.000 7 98

    (6 row(s) affected)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks, I agree using the solution to fill a sister table with the result and a task to keep it up to date makes the most efficient sense.

    Sorry about posting to the wrong forum, I didn't notice that when I posted based on a previous post.

    Thanks for your help.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply