Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

Update Gaps Expand / Collapse
Author
Message
Posted Thursday, December 06, 2012 5:36 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, July 15, 2013 2:52 PM
Points: 89, Visits: 385
I'm trying to update gaps in a table using a set based approach, but have been struggling with the solution. I've read a few gaps and islands posts in this forum, and was hopeful that I could figure it out, but I'm at a loss. Here's the sql to create the table and populate it with data.

CREATE TABLE dbo.UpdateGaps(
MachineID int NOT NULL,
RecordedDate datetime NOT NULL,
Pressure float NULL,
CONSTRAINT PK_UpdateGaps PRIMARY KEY CLUSTERED (MachineID,RecordedDate)
)

INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-02', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-03', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-04', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-05', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-06', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-07', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-08', 330);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-09', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-10', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-02', 380);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-03', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-04', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-05', 350);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-06', 350);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-07', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-08', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-09', 400);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-10', 0);
GO

What I need to do is fill in the zero values with the value that precedes it (grouping by MachineID). Non-zero values do not change.

MachineID	RecordedDate	Pressure
--------- ------------ --------
60 2012-10-02 0 <-- no change (nothing precedes this record)
60 2012-10-03 300 <-- no change
60 2012-10-04 300 <-- no change
60 2012-10-05 0 <-- update to 300
60 2012-10-06 0 <-- update to 300
60 2012-10-07 0 <-- update to 300
60 2012-10-08 330 <-- no change
60 2012-10-09 0 <-- update to 330
60 2012-10-10 300 <-- no change
97 2012-10-02 380 <-- no change (new MachineID)
97 2012-10-03 0 <-- update to 380
97 2012-10-04 0 <-- update to 380
97 2012-10-05 350 <-- no change
97 2012-10-06 350 <-- no change
97 2012-10-07 0 <-- update to 350
97 2012-10-08 0 <-- update to 350
97 2012-10-09 400 <-- no change
97 2012-10-10 0 <-- update to 400

I read this yesterday, http://www.manning.com/nielsen/nielsenMEAP_freeCh5.pdf, which gave me hope, but I still can't figure out how to update the table without using a cursor. Using the information in the PDF, I created the query below which gives a date range for the gaps. This is where I was hopeful that I could figure it out, but I still keep falling back to a cursor based approach.

WITH C AS
(
SELECT MachineID, RecordedDate, Pressure,
ROW_NUMBER() OVER(ORDER BY MachineID, RecordedDate)
- ROW_NUMBER() OVER(ORDER BY Pressure,MachineID,RecordedDate) AS grp
FROM dbo.UpdateGaps
)
SELECT MachineID, MIN(RecordedDate) AS mn, MAX(RecordedDate) AS mx, Pressure
FROM C
GROUP BY MachineID, Pressure, grp
ORDER BY MachineID, mn;

If anyone has done this sort of thing before, I would greatly appreciate any help you have to offer. And thank you, capnhector, for linking to the PDF in the thread, "Solve Problems Using Recursive CTE". That's a great source of information on gaps and islands. Thanks to Itzik Ben-Gan for writing it.
Post #1393805
Posted Thursday, December 06, 2012 5:50 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 6:03 PM
Points: 3,590, Visits: 5,098
You can try a quirky update (QU).

DECLARE @MachineID INT = 0
,@Pressure FLOAT = 0

UPDATE u WITH(TABLOCKX)
SET Pressure = CASE WHEN Pressure = 0
THEN @Pressure ELSE Pressure END
,@Pressure = CASE WHEN Pressure = 0 AND @MachineID = MachineID
THEN @Pressure ELSE Pressure END
,@MachineID = MachineID
FROM dbo.UpdateGaps u
OPTION(MAXDOP 1)


Edit: Ooops! Initially forgot to account for change in Machine ID.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1393812
Posted Thursday, December 06, 2012 6:34 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 3:21 PM
Points: 35,958, Visits: 30,249
I don't know what others call this type of problem but I call them "data smears" because it's like you smearing strips of color down from one wet paint spot to another.

I don't believe the following will be quite as fast as a Quirky Update, but it'll blow the doors off a cursor especially in the face of the correct index. And thank you VERY much for posting such clear requirements and readily consumable data!

 UPDATE ug
SET Pressure = ca.Pressure
FROM dbo.UpdateGaps ug
CROSS APPLY (SELECT TOP 1 Pressure
FROM dbo.UpdateGaps ugca
WHERE ugca.MachineID = ug.MachineID
AND ugca.RecordedDate <= ug.RecordedDate
AND ugca.Pressure > 0
ORDER BY ugca.MachineID,ugca.RecordedDate DESC) ca
;



--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1393818
Posted Thursday, December 06, 2012 6:44 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 6:03 PM
Points: 3,590, Visits: 5,098
Jeff Moden (12/6/2012)

I don't believe the following won't be quite as fast as a Quirky Update...


Easy for you to say...



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1393824
Posted Thursday, December 06, 2012 6:45 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, December 13, 2012 6:34 PM
Points: 6, Visits: 58
with updatebase_cte (MachineID,RecordedDate,Pressure,ind) as (
select s.MachineID,s.RecordedDate,s.Pressure, 0 as ind
from yourtable s
where s.RecordedDate='2012-10-02'
union all
select s.MachineID,s.RecordedDate,
case when s.Pressure = 0 and sc.Pressure <> 0 then sc.Pressure else s.Pressure end as Pressure,
case when s.Pressure = 0 and sc.Pressure <> 0 then 1 else 0 end as ind
from yourtable s
inner join updatebase_cte sc
on (s.RecordedDate = dateadd(day,1,sc.RecordedDate) and s.MachineID=sc.MachineID)
)
-- this will return data need be updated part, you can use it to update by a join.
select * from updatebase_cte where ind=1

Just a quick coding, have not got time to test. Does this work?
Post #1393826
Posted Thursday, December 06, 2012 8:33 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, July 15, 2013 2:52 PM
Points: 89, Visits: 385
dwain.c & Jeff Moden, your solutions worked great! Thank you both so much!

Jeff - "Data smears" sounds good to me. Regarding the "clear requirements and readily consumable data" comment, I think it's just common courtesy. I mean, if you're asking people to help you, why not make it easier for them to help you.

You two were quick too!

Thanks again!
Post #1393839
Posted Thursday, December 06, 2012 8:45 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 6:03 PM
Points: 3,590, Visits: 5,098
You're welcome!

I note that I neglected to mention mine only works because you happened to already have the clustered index needed to order the QU.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1393842
Posted Thursday, December 06, 2012 9:56 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Today @ 12:26 PM
Points: 25, Visits: 2,093
Here is another possibility. Let me say before hand that Jeff Moden's solution is better than this for the specific task you asked for; (on my laptop, my solution is 85% of the cost, whereas jeff's solution is 15% of the cost)

But in case you want to assign sequential ids to each group and sequential ids to rows within a group, you can consider this approach; With this approach, you can answer, for example, what is the average pressure of the 15th group after 10/24/2012.

; WITH R (MachineId, Pressure, RecordedDate, Rid) AS
(
SELECT MachineId, Pressure, RecordedDate,
ROW_NUMBER() OVER(PARTITION BY MachineId ORDER BY MachineId, RecordedDate) AS Rid
FROM dbo.UpdateGaps
),
GRPROW(MachineId, Pressure, RecordedDate, GroupId, RowId) AS
(
SELECT G1.MachineId, G1.Pressure, G1.RecordedDate,
DENSE_RANK() OVER (PARTITION BY G1.MachineId ORDER BY G1.MachineId, ISNULL(MIN(G2.Rid) - 1, G1.Rid)) AS GroupId,
ROW_NUMBER() OVER (PARTITION BY G1.MachineId, ISNULL(MIN(G2.Rid) - 1, G1.Rid) ORDER BY G1.MachineId, ISNULL(MIN(G2.Rid) - 1, G1.Rid), G1.RecordedDate DESC) AS RowId
FROM R G1
LEFT JOIN R G2 ON (G1.MachineID = G2.MachineID
AND G2.Rid > G1.Rid
AND G1.Pressure <> G2.Pressure)
GROUP BY G1.MachineId, G1.Pressure, G1.RecordedDate, G1.Rid
)
SELECT G1.MachineId, G1.Pressure, G1.RecordedDate, G1.GroupId, G1.RowId AS DescendingRowId,
CASE WHEN ISNULL(G2.Pressure, 0) = 0 THEN G1.Pressure ELSE G2.Pressure END AS NewPressure
FROM GRPROW G1
LEFT JOIN GRPROW G2 ON (G1.MachineId = G2.MachineId
AND G1.GroupId - 1 = G2.GroupId
AND G2.RowId = 1)
ORDER BY G1.MachineId, G1.RecordedDate

Post #1393847
Posted Thursday, December 06, 2012 10:44 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 3:45 PM
Points: 22,492, Visits: 30,192
Here is another solution to the problem:


CREATE TABLE dbo.UpdateGaps(
MachineID int NOT NULL,
RecordedDate datetime NOT NULL,
Pressure float NULL,
CONSTRAINT PK_UpdateGaps PRIMARY KEY CLUSTERED (MachineID,RecordedDate)
)

INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-02', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-03', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-04', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-05', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-06', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-07', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-08', 330);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-09', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-10', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-02', 380);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-03', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-04', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-05', 350);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-06', 350);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-07', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-08', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-09', 400);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-10', 0);
GO

with Base0Data as (
select
MachineID,
RecordedDate,
Pressure,
rn = row_number() over (partition by MachineID, case when Pressure = 0 then 0 else 1 end order by RecordedDate)
from
dbo.UpdateGaps
)
,BaseData as (
select
MachineID,
RecordedDate,
Pressure,
GrpDate = dateadd(dd,-rn,RecordedDate),
rn,
rn1 = row_number() over (partition by MachineID, dateadd(dd,-rn,RecordedDate) order by dateadd(dd,-rn,RecordedDate))
from
Base0Data
)
--select * from BaseData where Pressure = 0
select
bd1.MachineID,
bd1.RecordedDate,
Pressure = case when bd1.Pressure = 0 then coalesce(bd2.Pressure, bd1.Pressure) else bd1.Pressure end
from
BaseData bd1
left outer join BaseData bd2
on (bd1.MachineID = bd2.MachineID
and dateadd(dd, -bd1.rn1, bd1.RecordedDate) = bd2.RecordedDate)
order by
bd1.MachineID,
bd1.RecordedDate;
go

drop table dbo.UpdateGaps;
go





Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1393853
Posted Thursday, December 06, 2012 11:23 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 3:21 PM
Points: 35,958, Visits: 30,249
dwain.c (12/6/2012)
Jeff Moden (12/6/2012)

I don't believe the following won't be quite as fast as a Quirky Update...


Easy for you to say...


Heh... suffering from a NEC (Not Enough Coffee) problem on that one, for sure. I meant to say that "I don't believe that the following will be quite as fast as a Quirky Update" but got my tangle all toungled up.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1393869
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse