SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Update Gaps


Update Gaps

Author
Message
oralinque
oralinque
SSC-Enthusiastic
SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)

Group: General Forum Members
Points: 115 Visits: 441
I'm trying to update gaps in a table using a set based approach, but have been struggling with the solution. I've read a few gaps and islands posts in this forum, and was hopeful that I could figure it out, but I'm at a loss. Here's the sql to create the table and populate it with data.

CREATE TABLE dbo.UpdateGaps(
MachineID int NOT NULL,
RecordedDate datetime NOT NULL,
Pressure float NULL,
CONSTRAINT PK_UpdateGaps PRIMARY KEY CLUSTERED (MachineID,RecordedDate)
)

INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-02', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-03', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-04', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-05', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-06', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-07', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-08', 330);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-09', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-10', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-02', 380);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-03', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-04', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-05', 350);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-06', 350);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-07', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-08', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-09', 400);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-10', 0);
GO



What I need to do is fill in the zero values with the value that precedes it (grouping by MachineID). Non-zero values do not change.


MachineID RecordedDate Pressure
--------- ------------ --------
60 2012-10-02 0 <-- no change (nothing precedes this record)
60 2012-10-03 300 <-- no change
60 2012-10-04 300 <-- no change
60 2012-10-05 0 <-- update to 300
60 2012-10-06 0 <-- update to 300
60 2012-10-07 0 <-- update to 300
60 2012-10-08 330 <-- no change
60 2012-10-09 0 <-- update to 330
60 2012-10-10 300 <-- no change
97 2012-10-02 380 <-- no change (new MachineID)
97 2012-10-03 0 <-- update to 380
97 2012-10-04 0 <-- update to 380
97 2012-10-05 350 <-- no change
97 2012-10-06 350 <-- no change
97 2012-10-07 0 <-- update to 350
97 2012-10-08 0 <-- update to 350
97 2012-10-09 400 <-- no change
97 2012-10-10 0 <-- update to 400



I read this yesterday, http://www.manning.com/nielsen/nielsenMEAP_freeCh5.pdf, which gave me hope, but I still can't figure out how to update the table without using a cursor. Using the information in the PDF, I created the query below which gives a date range for the gaps. This is where I was hopeful that I could figure it out, but I still keep falling back to a cursor based approach.

WITH C AS
(
SELECT MachineID, RecordedDate, Pressure,
ROW_NUMBER() OVER(ORDER BY MachineID, RecordedDate)
- ROW_NUMBER() OVER(ORDER BY Pressure,MachineID,RecordedDate) AS grp
FROM dbo.UpdateGaps
)
SELECT MachineID, MIN(RecordedDate) AS mn, MAX(RecordedDate) AS mx, Pressure
FROM C
GROUP BY MachineID, Pressure, grp
ORDER BY MachineID, mn;



If anyone has done this sort of thing before, I would greatly appreciate any help you have to offer. And thank you, capnhector, for linking to the PDF in the thread, "Solve Problems Using Recursive CTE". That's a great source of information on gaps and islands. Thanks to Itzik Ben-Gan for writing it.
dwain.c
dwain.c
SSCertifiable
SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)

Group: General Forum Members
Points: 7527 Visits: 6431
You can try a quirky update (QU).


DECLARE @MachineID INT = 0
,@Pressure FLOAT = 0

UPDATE u WITH(TABLOCKX)
SET Pressure = CASE WHEN Pressure = 0
THEN @Pressure ELSE Pressure END
,@Pressure = CASE WHEN Pressure = 0 AND @MachineID = MachineID
THEN @Pressure ELSE Pressure END
,@MachineID = MachineID
FROM dbo.UpdateGaps u
OPTION(MAXDOP 1)




Edit: Ooops! Initially forgot to account for change in Machine ID.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)

Group: General Forum Members
Points: 88950 Visits: 41136
I don't know what others call this type of problem but I call them "data smears" because it's like you smearing strips of color down from one wet paint spot to another.

I don't believe the following will be quite as fast as a Quirky Update, but it'll blow the doors off a cursor especially in the face of the correct index. And thank you VERY much for posting such clear requirements and readily consumable data!


UPDATE ug
SET Pressure = ca.Pressure
FROM dbo.UpdateGaps ug
CROSS APPLY (SELECT TOP 1 Pressure
FROM dbo.UpdateGaps ugca
WHERE ugca.MachineID = ug.MachineID
AND ugca.RecordedDate <= ug.RecordedDate
AND ugca.Pressure > 0
ORDER BY ugca.MachineID,ugca.RecordedDate DESC) ca
;



--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
dwain.c
dwain.c
SSCertifiable
SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)

Group: General Forum Members
Points: 7527 Visits: 6431
Jeff Moden (12/6/2012)

I don't believe the following won't be quite as fast as a Quirky Update...


Easy for you to say...


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
bj_shenglong
bj_shenglong
Forum Newbie
Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)

Group: General Forum Members
Points: 8 Visits: 63
with updatebase_cte (MachineID,RecordedDate,Pressure,ind) as (
select s.MachineID,s.RecordedDate,s.Pressure, 0 as ind
from yourtable s
where s.RecordedDate='2012-10-02'
union all
select s.MachineID,s.RecordedDate,
case when s.Pressure = 0 and sc.Pressure <> 0 then sc.Pressure else s.Pressure end as Pressure,
case when s.Pressure = 0 and sc.Pressure <> 0 then 1 else 0 end as ind
from yourtable s
inner join updatebase_cte sc
on (s.RecordedDate = dateadd(day,1,sc.RecordedDate) and s.MachineID=sc.MachineID)
)
-- this will return data need be updated part, you can use it to update by a join.
select * from updatebase_cte where ind=1

Just a quick coding, have not got time to test. Does this work?:-)
oralinque
oralinque
SSC-Enthusiastic
SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)

Group: General Forum Members
Points: 115 Visits: 441
dwain.c & Jeff Moden, your solutions worked great! Thank you both so much!

Jeff - "Data smears" sounds good to me. Regarding the "clear requirements and readily consumable data" comment, I think it's just common courtesy. I mean, if you're asking people to help you, why not make it easier for them to help you.

You two were quick too!

Thanks again!
dwain.c
dwain.c
SSCertifiable
SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)

Group: General Forum Members
Points: 7527 Visits: 6431
You're welcome!

I note that I neglected to mention mine only works because you happened to already have the clustered index needed to order the QU.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
Sam S Kolli
Sam S Kolli
SSC Journeyman
SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)

Group: General Forum Members
Points: 79 Visits: 3486
Here is another possibility. Let me say before hand that Jeff Moden's solution is better than this for the specific task you asked for; (on my laptop, my solution is 85% of the cost, whereas jeff's solution is 15% of the cost)

But in case you want to assign sequential ids to each group and sequential ids to rows within a group, you can consider this approach; With this approach, you can answer, for example, what is the average pressure of the 15th group after 10/24/2012.


; WITH R (MachineId, Pressure, RecordedDate, Rid) AS
(
SELECT MachineId, Pressure, RecordedDate,
ROW_NUMBER() OVER(PARTITION BY MachineId ORDER BY MachineId, RecordedDate) AS Rid
FROM dbo.UpdateGaps
),
GRPROW(MachineId, Pressure, RecordedDate, GroupId, RowId) AS
(
SELECT G1.MachineId, G1.Pressure, G1.RecordedDate,
DENSE_RANK() OVER (PARTITION BY G1.MachineId ORDER BY G1.MachineId, ISNULL(MIN(G2.Rid) - 1, G1.Rid)) AS GroupId,
ROW_NUMBER() OVER (PARTITION BY G1.MachineId, ISNULL(MIN(G2.Rid) - 1, G1.Rid) ORDER BY G1.MachineId, ISNULL(MIN(G2.Rid) - 1, G1.Rid), G1.RecordedDate DESC) AS RowId
FROM R G1
LEFT JOIN R G2 ON (G1.MachineID = G2.MachineID
AND G2.Rid > G1.Rid
AND G1.Pressure <> G2.Pressure)
GROUP BY G1.MachineId, G1.Pressure, G1.RecordedDate, G1.Rid
)
SELECT G1.MachineId, G1.Pressure, G1.RecordedDate, G1.GroupId, G1.RowId AS DescendingRowId,
CASE WHEN ISNULL(G2.Pressure, 0) = 0 THEN G1.Pressure ELSE G2.Pressure END AS NewPressure
FROM GRPROW G1
LEFT JOIN GRPROW G2 ON (G1.MachineId = G2.MachineId
AND G1.GroupId - 1 = G2.GroupId
AND G2.RowId = 1)
ORDER BY G1.MachineId, G1.RecordedDate


Lynn Pettis
Lynn Pettis
SSC-Forever
SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)

Group: General Forum Members
Points: 40598 Visits: 38567
Here is another solution to the problem:



CREATE TABLE dbo.UpdateGaps(
MachineID int NOT NULL,
RecordedDate datetime NOT NULL,
Pressure float NULL,
CONSTRAINT PK_UpdateGaps PRIMARY KEY CLUSTERED (MachineID,RecordedDate)
)

INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-02', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-03', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-04', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-05', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-06', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-07', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-08', 330);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-09', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (60, '2012-10-10', 300);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-02', 380);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-03', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-04', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-05', 350);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-06', 350);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-07', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-08', 0);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-09', 400);
INSERT dbo.UpdateGaps (MachineID, RecordedDate, Pressure) VALUES (97, '2012-10-10', 0);
GO

with Base0Data as (
select
MachineID,
RecordedDate,
Pressure,
rn = row_number() over (partition by MachineID, case when Pressure = 0 then 0 else 1 end order by RecordedDate)
from
dbo.UpdateGaps
)
,BaseData as (
select
MachineID,
RecordedDate,
Pressure,
GrpDate = dateadd(dd,-rn,RecordedDate),
rn,
rn1 = row_number() over (partition by MachineID, dateadd(dd,-rn,RecordedDate) order by dateadd(dd,-rn,RecordedDate))
from
Base0Data
)
--select * from BaseData where Pressure = 0
select
bd1.MachineID,
bd1.RecordedDate,
Pressure = case when bd1.Pressure = 0 then coalesce(bd2.Pressure, bd1.Pressure) else bd1.Pressure end
from
BaseData bd1
left outer join BaseData bd2
on (bd1.MachineID = bd2.MachineID
and dateadd(dd, -bd1.rn1, bd1.RecordedDate) = bd2.RecordedDate)
order by
bd1.MachineID,
bd1.RecordedDate;
go

drop table dbo.UpdateGaps;
go




Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)

Group: General Forum Members
Points: 88950 Visits: 41136
dwain.c (12/6/2012)
Jeff Moden (12/6/2012)

I don't believe the following won't be quite as fast as a Quirky Update...


Easy for you to say...


Heh... suffering from a NEC (Not Enough Coffee) problem on that one, for sure. :-P I meant to say that "I don't believe that the following will be quite as fast as a Quirky Update" but got my tangle all toungled up. Hehe

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search