Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Data from Previous Row_2 Expand / Collapse
Author
Message
Posted Friday, January 10, 2014 2:53 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, September 3, 2014 1:43 PM
Points: 23, Visits: 126
I would like an efficient way to "fill missing data".
My preference would be a view.

I have data like:
Date, A, B
1/1/2013 5 99
1/2/2013 null 99
1/3/2013 6 null
1/4/2013 null null
1/5/2013 7 98
1/6/2013 null null

and view results:

Date, A, B
1/1/2013 5 99
1/2/2013 5 99
1/3/2013 6 99
1/4/2013 6 99
1/5/2013 7 98
1/6/2013 7 98

etc...you get the idea
caution: there are about a million rows in the raw table

Mike O.
Post #1529984
Posted Friday, January 10, 2014 4:04 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 6:55 PM
Points: 3,636, Visits: 8,150
Do you have continuous dates? Or you might have gaps?


Luis C.
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

Forum Etiquette: How to post data/code on a forum to get the best help
Post #1529997
Posted Friday, January 10, 2014 5:26 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, September 3, 2014 1:43 PM
Points: 23, Visits: 126
might have gaps
Post #1530004
Posted Friday, January 10, 2014 9:47 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 5:56 PM
Points: 7,075, Visits: 15,325
Since you're in SQL Server 2000, we can't use any CTE for this. That said - you should be able to use a temp table to get the ordering down.

You will also need a tally table to help with filling in the gaps (see Jeff's article on how to set one up)

The script would look something like the following:


declare @maxdt datetime
select @maxdt=MAX(dateval) from table1

create table #orderedGaps (seq int identity(1,1), dateval datetime, valone int, val2 int)

insert into #orderedGaps (dateval, valone,valtwo)
select date, A, B from table1
order by date

select o1.seq ,dateadd(day,rn-1,o1.dateval) ajustedDateVal,o2.dateval, o1.valone, o1.valtwo
from #orderedgaps o1 left join orderedGaps #o2 on o1.seq+1=o2.seq
join tally on tally.N<= DATEDIFF(day,o1.dateval,isnull(o2.dateval, @maxdt))
order by date



----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #1530014
Posted Saturday, January 11, 2014 9:06 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, September 3, 2014 1:43 PM
Points: 23, Visits: 126
I am using SQL server 2005, 2008 and 2012 so CTE's are OK
Thanks for your solution, but it looks a little messy as a view definition.
I was thinking of something using ROWNUMBER.
I think your solution has problems if there are more than one row with NULL between valid data.

what is rn? a value from your Tally table?
as in dateadd(day,rn-1,o1.dateval)
Post #1530040
Posted Saturday, January 11, 2014 1:53 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:14 PM
Points: 35,266, Visits: 31,756
Mike Osborne (1/10/2014)
caution: there are about a million rows in the raw table


Any chance that you'd like to have the original data in the table updated? If not, any chance of adding two columns to the table that can be update? I ask because it seems a real waste to constantly calculate such "data smears" and they certainly won't be indexable through a view unless it's materialized as a indexed view, which would probably take more space than just adding two columns that can be updated.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1530059
Posted Saturday, January 11, 2014 2:11 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 5:56 PM
Points: 7,075, Visits: 15,325
Mike Osborne (1/11/2014)
I am using SQL server 2005, 2008 and 2012 so CTE's are OK
Thanks for your solution, but it looks a little messy as a view definition.
I was thinking of something using ROWNUMBER.
I think your solution has problems if there are more than one row with NULL between valid data.

what is rn? a value from your Tally table?
as in dateadd(day,rn-1,o1.dateval)


Well yes - the solution does get a bit cleaner using rownumber and CTE's: for one thing you don't need the temp table declared (which would be the difference between a view and not).

As Jeff would describe it, if you're looking to "smear" data of a series of days but want to smear it over nulls or blanks, just exclude them from your initial orderedgaps table. This solution will NOT work if you want to smear ONE column but keep the other (that gets to be a lot more involved).

and yes - my tally table uses RN as the column name for the incrementing numbers. just missed one of the references (jeff's version uses N as the name).


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #1530061
Posted Saturday, January 11, 2014 6:02 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, September 3, 2014 1:43 PM
Points: 23, Visits: 126
Can't update the original table...knowing when the data came in is an important piece of data.
adding a couple of columns and running a process once in a while to update new data is a good idea.
I guess I was hoping to find a cheap view so I would not "mess up" the source data tables.
I said the table has about a million records but the typical query would only return a thousand or less.

Thanks for your help,
Mike
Post #1530072
Posted Sunday, January 12, 2014 12:33 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:14 PM
Points: 35,266, Visits: 31,756
Mike Osborne (1/11/2014)
Can't update the original table...knowing when the data came in is an important piece of data.
adding a couple of columns and running a process once in a while to update new data is a good idea.
I guess I was hoping to find a cheap view so I would not "mess up" the source data tables.
I said the table has about a million records but the typical query would only return a thousand or less.

Thanks for your help,
Mike


You could probably get away with a MAX where the date is less than date x. You'd need some decent indexing for it to be quick. In 2K12, you could use Lead/Lag functionality for it. It just seems a shame to keep wasting clock cycles on something that will never change.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1530142
Posted Sunday, January 12, 2014 1:00 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:14 PM
Points: 35,266, Visits: 31,756
Mike Osborne (1/10/2014)
I would like an efficient way to "fill missing data".
My preference would be a view.

I have data like:
Date, A, B
1/1/2013 5 99
1/2/2013 null 99
1/3/2013 6 null
1/4/2013 null null
1/5/2013 7 98
1/6/2013 null null

and view results:

Date, A, B
1/1/2013 5 99
1/2/2013 5 99
1/3/2013 6 99
1/4/2013 6 99
1/5/2013 7 98
1/6/2013 7 98

etc...you get the idea
caution: there are about a million rows in the raw table

Mike O.


You're kind of new here so a couple of hints, first.

Always post in the correct form. I don't know why you posted this in a 2000 forum and it only led to some confusion.

If you want coded answers, please see the first link under "Helpful Links" in my signature line below.

I'll take care of that this time because you're new.

Here's the kind of code that I was talking about. It actually has to hit the table 3 times. While it can be made to work fairly quickly, it just seems unnecessary since the original data will never change. You could even make a sister table to hold prefilled columns and keep them updated with a well written trigger.

DROP TABLE #TestTable
GO
--===== Create the test table (not a part of the solution)
CREATE TABLE #TestTable
(
Date DATETIME NOT NULL PRIMARY KEY CLUSTERED
,A INT
,B INT
)
;
--===== Populate the table with test data (not a part of the solution)
INSERT INTO #TestTable
SELECT '1/1/2013',5 ,99 UNION ALL
SELECT '1/2/2013',null,99 UNION ALL
SELECT '1/3/2013',6 ,null UNION ALL
SELECT '1/4/2013',null,null UNION ALL
SELECT '1/5/2013',7 ,98 UNION ALL
SELECT '1/6/2013',null,null
;
--===== Solve the problem using conventional means
SELECT tt.Date
,a.A
,b.B
FROM #TestTable tt
CROSS APPLY (SELECT TOP 1 tta.A FROM #TestTable tta WHERE tta.Date <= tt.Date AND tta.A IS NOT NULL ORDER BY tta.Date DESC)a(A)
CROSS APPLY (SELECT TOP 1 ttb.B FROM #TestTable ttb WHERE ttb.Date <= tt.Date AND ttb.B IS NOT NULL ORDER BY ttb.Date DESC)b(B)
ORDER BY tt.Date
;


Result set for the currently provided data and problem definition...
Date                    A           B
----------------------- ----------- -----------
2013-01-01 00:00:00.000 5 99
2013-01-02 00:00:00.000 5 99
2013-01-03 00:00:00.000 6 99
2013-01-04 00:00:00.000 6 99
2013-01-05 00:00:00.000 7 98
2013-01-06 00:00:00.000 7 98

(6 row(s) affected)



--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1530149
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse