Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

Help finding an old post Expand / Collapse
Author
Message
Posted Thursday, January 30, 2014 6:04 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 9:38 AM
Points: 486, Visits: 1,976
Hi,

A while back I recall reading a post on here where a user needed to update a column with values until there was a new value, then update with that new value until it hit the next new value, and so on.

There were some really excellent solutions, and now that I'm facing a similar problem, I'd really like to review them. But of course I can't find it.

I know it's sort of a gaps and islands problem generally.

Thanks
Post #1536280
Posted Thursday, January 30, 2014 6:18 PM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 9:38 AM
Points: 486, Visits: 1,976
I'll take that as a no

Anyway, here's what I'm generally trying to do:

CREATE TABLE testsql (
cid integer,
ctime datetime,
cgm integer);

INSERT INTO testsql VALUES
('1', '2012-01-01 07:00', '1'),
('1', '2012-01-01 07:01', '1'),
('1', '2012-01-01 07:02', '1'),
('1', '2012-01-01 07:03', NULL),
('1', '2012-01-01 07:04', NULL),
('1', '2012-01-01 07:05', '5'),
('1', '2012-01-01 07:06', '5'),
('1', '2012-01-01 07:07', '5'),
('1', '2012-01-01 07:08', NULL),
('1', '2012-01-01 07:09', NULL),
('1', '2012-01-01 07:10', NULL),
('1', '2012-01-01 07:11', NULL),
('2', '2013-05-02 07:12', '24'),
('2', '2013-05-02 07:13', '24'),
('2', '2013-05-02 07:14', '24'),
('2', '2013-05-02 07:15', NULL),
('2', '2013-05-02 07:16', NULL),
('2', '2013-05-02 07:17', '44'),
('2', '2013-05-02 07:18', '44'),
('2', '2013-05-02 07:19', NULL),
('2', '2013-05-02 07:20', '4'),
('2', '2013-05-02 07:21', '4'),
('2', '2013-05-02 07:22', '4');

And here's the query

WITH C (id, pt, gm, dr) as (
SELECT cid, ctime, cgm,
dense_rank() over (partition by cid, cgm order by cid, ctime) as dr
FROM testsql
)
SELECT id, pt, gm,
CASE WHEN gm IS NULL THEN
LAG(gm, cast(c.DR as int)) OVER (PARTITION BY id ORDER BY id, pt)
ELSE gm END as gm2
FROM C
ORDER BY id, pt, gm

There's one result towards the end that is still NULL, rather than being filled in with a LAG value, like the rest in the CASE column are.

Any suggestions?

Post #1536612
Posted Thursday, January 30, 2014 8:20 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Thursday, November 6, 2014 1:00 PM
Points: 5,333, Visits: 25,277
Could this be the article:

Group Islands of Contiguous Dates (SQL Spackle)
By Jeff Moden, 2013/10/04 (first published: 2010/12/07)

http://www.sqlservercentral.com/articles/T-SQL/71550/


If everything seems to be going well, you have obviously overlooked something.

Ron

Please help us, help you -before posting a question please read

Before posting a performance problem please read
Post #1536621
Posted Friday, January 31, 2014 9:21 AM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Today @ 6:00 AM
Points: 542, Visits: 2,136
sqldriver,
Could your data also contain a sequence like the following, i.e. cgm changes without one or more rows having cgm null in between:

('1', '2012-01-01 07:05', '5'),
('1', '2012-01-01 07:06', '5'),
('1', '2012-01-01 07:07', '5'),
('2', '2013-05-02 07:10', '24'),
('2', '2013-05-02 07:20', '24'),
('2', '2013-05-02 07:30', '24')

Could there be gaps in the ctime values (as shown above)?
What version of SQL Server must this run under?
Post #1536842
Posted Friday, January 31, 2014 9:24 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 9:38 AM
Points: 486, Visits: 1,976
bitbucket-25253 (1/30/2014)
Could this be the article:

Group Islands of Contiguous Dates (SQL Spackle)
By Jeff Moden, 2013/10/04 (first published: 2010/12/07)

http://www.sqlservercentral.com/articles/T-SQL/71550/


That's a fine article, but this was a forum post. There were a number of pages of replies, but maybe it only seemed memorable to me!
Post #1536845
Posted Friday, January 31, 2014 9:27 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 9:38 AM
Points: 486, Visits: 1,976
Michael Meierruth (1/31/2014)
sqldriver,
Could your data also contain a sequence like the following, i.e. cgm changes without one or more rows having cgm null in between:

('1', '2012-01-01 07:05', '5'),
('1', '2012-01-01 07:06', '5'),
('1', '2012-01-01 07:07', '5'),
('2', '2013-05-02 07:10', '24'),
('2', '2013-05-02 07:20', '24'),
('2', '2013-05-02 07:30', '24')

Could there be gaps in the ctime values (as shown above)?
What version of SQL Server must this run under?


Hi,

No, the issue with the data is that there are NULL gaps in the 'check in' column all the way through. I haven't seen a contiguous block. It will be running on 2012.

Thanks




Post #1536847
Posted Friday, January 31, 2014 10:44 AM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Today @ 6:00 AM
Points: 542, Visits: 2,136
OK, just saw your use of LAG which means SS20012.
But you're making completely wrong use of it.
If you delete one of the rows with cgm='5' from your data and run your script the results become a lot worse.
In your final select display the dr column and you'll see this right away.
Also, no dense_rank needed here. Rank is sufficient.
Post #1536892
Posted Friday, January 31, 2014 12:37 PM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 9:38 AM
Points: 486, Visits: 1,976
Michael Meierruth (1/31/2014)
OK, just saw your use of LAG which means SS20012.
But you're making completely wrong use of it.
If you delete one of the rows with cgm='5' from your data and run your script the results become a lot worse.
In your final select display the dr column and you'll see this right away.
Also, no dense_rank needed here. Rank is sufficient.


It's not completely wrong - it's getting me all the test data except one row how I want it. Do you have another suggestion?

Switching from dense rank to rank doesn't change output, so I'm not sure what you mean.

Thanks
Post #1536935
Posted Saturday, February 1, 2014 5:35 AM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Today @ 6:00 AM
Points: 542, Visits: 2,136
sqldriver,

You didn't understand my remark regarding removing one of the rows with cgm='5'.
If you do that your solution will now show even more incorrect results.

I understand what you are trying to do but unfortunately the rank values being generated make your approach invalid.
I can't see a way a to generate the rank values the way you would like them.

Thus here is my solution which takes a more classic approach in dealing with arbitrarily embedded groups.
It also will handle some strange cases.
Thus a more generic problem definition is:
within every cid group, update every group of nulls with the cgm value of the row immediately before the null group (if it exists)

Here is the script:

with

cte1 as -- change cgm nulls to 0 to avoid confusion with other types of nulls later on
(
select cid,ctime,isnull(cgm,0) cgm
from testsql
),

cte2 as -- find cid,cgm values before and after each row (-1 is returned for the first and last row)
(
select cid,ctime,cgm,
lag(cgm,1,-1) over (order by ctime) cgmbef,lead(cgm,1,-1) over (order by ctime) cgmaft,
lag(cid,1,-1) over (order by ctime) cidbef,lead(cid,1,-1) over (order by ctime) cidaft
from cte1
),

cte3 as -- find first and last element of each null group
(
select cid,ctime,cgmbef,cgmaft,
case when cid<>cidbef or cgm<>cgmbef then 1 else 0 end isfirst,
case when cid<>cidaft or cgm<>cgmaft then 1 else 0 end islast
from cte2
where cgm=0
),

cte4 as -- extract only the beginning of each group
(
select cid,ctime,cgmbef,rank() over(partition by cid order by ctime) nullgroupid
from cte3
where isfirst=1
),

cte5 as -- extract only then end of each group
(
select cid,ctime,cgmbef,rank() over(partition by cid order by ctime) nullgroupid
from cte3
where islast=1
)

-- merge the two
select a.cid,a.ctime starttime,b.ctime endtime,a.cgmbef
from cte4 a
join cte5 b on a.cid=b.cid and a.nullgroupid=b.nullgroupid
where a.cgmbef>0

And instead of the final select you could do an update (which accepts the preceeding with block):

update t
set t.cgm=u.cgmbef
from testsql t
join
(
select a.cid,a.ctime starttime,b.ctime endtime,a.cgmbef
from cte4 a
join cte5 b on a.cid=b.cid and a.nullgroupid=b.nullgroupid
where a.cgmbef>0
) u on t.cid=u.cid and t.ctime>=starttime and t.ctime<=endtime

Post #1537068
Posted Saturday, February 1, 2014 11:46 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 9:34 AM
Points: 35,532, Visits: 32,115
sqldriver (1/30/2014)
Hi,

A while back I recall reading a post on here where a user needed to update a column with values until there was a new value, then update with that new value until it hit the next new value, and so on.

There were some really excellent solutions, and now that I'm facing a similar problem, I'd really like to review them. But of course I can't find it.

I know it's sort of a gaps and islands problem generally.

Thanks


Was this it?
http://www.sqlservercentral.com/Forums/Topic1529984-8-1.aspx


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1537084
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse