SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Compare Dates in an iterative manner for a particular set of records


Compare Dates in an iterative manner for a particular set of records

Author
Message
Carla Wilson-484785
Carla Wilson-484785
SSCrazy
SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)

Group: General Forum Members
Points: 2290 Visits: 1951
dfarran (11/13/2008)
It is also possible to do the query with just left joins,


SELECT a.PersonID, a.Version , a.dedate,
[is improper] = CASE WHEN MAX(c.version) IS NOT NULL THEN 1
ELSE 0 END
FROM personrecord a
LEFT JOIN personrecord b ON b.personid = a.personid
LEFT JOIN personrecord c ON c.personid = a.personid
AND c.version >= b.version
AND c.dedate < b.dedate
GROUP BY a.PersonID, a.Version, a.dedate



with the primary key on the table (personid, version) this performs slightly better than the cross join with both the small(8) and medium(circa .75m) rowcount.

I would avoid using the version = version -1 because this does not cater for missing versions (try doubling the version numbers in test data)

Good point about not relying on version = version-1!
Carla Wilson-484785
Carla Wilson-484785
SSCrazy
SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)

Group: General Forum Members
Points: 2290 Visits: 1951
Peter M. (11/12/2008)
I'd be more interested in seeing the execution times where the data size is non-trivial. My suspicion would lead me to believe the row_number() solution might scale better based on my own experience using the related dense_rank() function.

You could even use two dense rank functions that are partitioned identically but have an order by that is based on the version number in one and the date field in the other, then using that to compare when the resulting ranks are not equal. It would require a derived table joined to the original table, but you could get the results you are looking for.


and thanks, Jeff Moden, for pointing out the use of the clustered index on PersonId, Version.

I was interested to see how the different solutions would scale, so I expanded the PersonRecord table to 1 million PersonIDs, and expanded the rows of data to 5 records per PersonId, using Jeff's code. And I added the clustered index.

And I added dfarran's query.

Here are the results:
Original Query 153,733 Milliseconds duration

James Goodwin -
Left join on inner join 136,703 Milliseconds duration

Matt C -
SQL2005 row_number()
over (partition 184,280 Milliseconds duration

Adam Haines -
inner joins and CASE 138,906 Milliseconds duration

dfarrans -
just left joins,
no version-1 245,016 Milliseconds duration

Granted, these are total execution times, so include I/O.

Also, I did not alter the queries that used version-1, which is a weakness if the version sequence is not perfect.
AlexT-359331
AlexT-359331
SSC-Enthusiastic
SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)

Group: General Forum Members
Points: 175 Visits: 596
Am I missing something ?

select a.PersonID,a.Version, a.DEDate , 1 AS [Is ImProper]
from PersonRecord a
inner join PersonRecord b on a.PersonID = b.PersonID and a.Version>b.Version and a.DEDate<b.DEDate
Carla Wilson-484785
Carla Wilson-484785
SSCrazy
SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)

Group: General Forum Members
Points: 2290 Visits: 1951
AlexT (11/13/2008)
Am I missing something ?

select a.PersonID,a.Version, a.DEDate , 1 AS [Is ImProper]
from PersonRecord a
inner join PersonRecord b on a.PersonID = b.PersonID and a.Version>b.Version and a.DEDate<b.DEDate


Yes, you then need to apply this flag to all the records for that PersonId. (So, if a PersonId has 5 rows and 2 of them are out of order, then all 5 rows will have the ImProper = 1, and those PersonIds with no rows out of order will have ImProper = 0.)
AlexT-359331
AlexT-359331
SSC-Enthusiastic
SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)

Group: General Forum Members
Points: 175 Visits: 596
yes whatever,

select distinct a.PersonID,a.Version, a.DEDate , case when b.PersonID is null then 0 else 1 end AS [Is ImProper]
from PersonRecord a
left join (
select a.PersonID
from PersonRecord a
inner join PersonRecord b on a.PersonID = b.PersonID and a.Version>b.Version and a.DEDate<b.DEDate
) b on a.PersonID = b.PersonID

but initial tast was just get bad data entry dates
Carla Wilson-484785
Carla Wilson-484785
SSCrazy
SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)

Group: General Forum Members
Points: 2290 Visits: 1951
AlexT (11/13/2008)
yes whatever,

select distinct a.PersonID,a.Version, a.DEDate , case when b.PersonID is null then 0 else 1 end AS [Is ImProper]
from PersonRecord a
left join (
select a.PersonID
from PersonRecord a
inner join PersonRecord b on a.PersonID = b.PersonID and a.Version>b.Version and a.DEDate<b.DEDate
) b on a.PersonID = b.PersonID

but initial tast was just get bad data entry dates


Thanks, this achieves the desired results, and it's a straight-forward, easy-to-read solution.

When I tested it on my 5 million row table, total execution time was 164580 Milliseconds duration

Edited: Although I think I would put the "select distinct" in the subselect, for cases where one PersonId has multiple rows that are out of order.
AlexT-359331
AlexT-359331
SSC-Enthusiastic
SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)

Group: General Forum Members
Points: 175 Visits: 596
that supposed to be faster, if clustered index applied on (PersonID, Version )

select distinct a.PersonID,a.Version, a.DEDate , case when b.PersonID is null then 0 else 1 end AS [Is ImProper]
from PersonRecord a
left join (
select a.PersonID, a.Version
from PersonRecord a
inner join PersonRecord b on a.PersonID = b.PersonID and a.Version>b.Version and a.DEDate<b.DEDate
) b on a.PersonID = b.PersonID and a.Version=b.Version

thanks
Carla Wilson-484785
Carla Wilson-484785
SSCrazy
SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)

Group: General Forum Members
Points: 2290 Visits: 1951
AlexT (11/13/2008)
that supposed to be faster, if clustered index applied on (PersonID, Version )

select distinct a.PersonID,a.Version, a.DEDate , case when b.PersonID is null then 0 else 1 end AS [Is ImProper]
from PersonRecord a
left join (
select a.PersonID, a.Version
from PersonRecord a
inner join PersonRecord b on a.PersonID = b.PersonID and a.Version>b.Version and a.DEDate<b.DEDate
) b on a.PersonID = b.PersonID and a.Version=b.Version

thanks



Good point about the fact that Version is part of the clustered index.
Unfortunately, incorporating version into the subselct breaks the solution. It no longer returns the desired results.
AlexT-359331
AlexT-359331
SSC-Enthusiastic
SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)SSC-Enthusiastic (175 reputation)

Group: General Forum Members
Points: 175 Visits: 596
yes sorry, my mistake. if you don't mind you can try this

select a.PersonID, a.Version, a.DEDate,
case when b.PersonID is null then 0 else 1 end AS [Is ImProper]
from PersonRecord a
left join (
select distinct a.PersonID
from PersonRecord a
inner join PersonRecord b on a.PersonID = b.PersonID and a.Version>b.Version and a.DEDate<b.DEDate
) b on a.PersonID = b.PersonID

thanks
Carla Wilson-484785
Carla Wilson-484785
SSCrazy
SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)

Group: General Forum Members
Points: 2290 Visits: 1951
AlexT (11/13/2008)
yes sorry, my mistake. if you don't mind you can try this

select a.PersonID, a.Version, a.DEDate,
case when b.PersonID is null then 0 else 1 end AS [Is ImProper]
from PersonRecord a
left join (
select distinct a.PersonID
from PersonRecord a
inner join PersonRecord b on a.PersonID = b.PersonID and a.Version>b.Version and a.DEDate<b.DEDate
) b on a.PersonID = b.PersonID



This query does produce the desired effect, and yes, moving the distinct inside the subselect does result in a significant performance lift. Total execution time was 144470 Milliseconds! (vs. previous run time of was 164580 Milliseconds)
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search