Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Obtaining earliest change for one colum? Expand / Collapse
Author
Message
Posted Thursday, April 11, 2013 10:07 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, March 31, 2014 9:19 AM
Points: 314, Visits: 677
Guys,

I've been looking at this problem on and off for a while and I can't for the life of me find a solution. Basically we have an audit system where by every time someone changes one field in one table it appends the entire row into another table, a slight simplification but you get the idea.

What I want to get is a historical record for one field over time - so say this was a list of individuals and it had their wealth in dollars, their job title, their address, their age etc. It may be that their wealth in dollars didn't change over the course of say 10 rows in the table but other variables did change, I'm trying to build up a picture of their wealth changing. Totally random example but I thought it might help!

With this test data:

SELECT * INTO #test
FROM
(
SELECT 1 as Audit_ID, 123 as EntID, 100 as Field_A, 10 as Field_B, '01-Jan-12' as ActionDate UNION
SELECT 2 as Audit_ID, 123 as EntID, 110 as Field_A, 10 as Field_B, '15-Feb-12' as ActionDate UNION
SELECT 3 as Audit_ID, 123 as EntID, 150 as Field_A, 15 as Field_B, '20-Mar-12' as ActionDate UNION
SELECT 4 as Audit_ID, 123 as EntID, 100 as Field_A, 10 as Field_B, '12-Jun-12' as ActionDate UNION
SELECT 5 as Audit_ID, 445 as EntID, 50 as Field_A, 4 as Field_B, '02-Apr-12' as ActionDate UNION
SELECT 6 as Audit_ID, 445 as EntID, 60 as Field_A, 3 as Field_B, '15-May-12' as ActionDate UNION
SELECT 7 as Audit_ID, 445 as EntID, 30 as Field_A, 3 as Field_B, '18-Aug-12' as ActionDate UNION
SELECT 8 as Audit_ID, 552 as EntID, 500 as Field_A, 1 as Field_B, '10-Jan-12' as ActionDate UNION
SELECT 9 as Audit_ID, 552 as EntID, 450 as Field_A, 3 as Field_B, '15-Mar-12' as ActionDate UNION
SELECT 10 as Audit_ID, 552 as EntID, 320 as Field_A, 4 as Field_B, '29-Aug-12' as ActionDate
)x


Say I was trying to track field B over time for Entity (EntID) 123, I'd want a results set like this:

01-Jan-12 10
20-Mar-12 15
12-Jun-12 10

I thought I could do some kind of ranking whereby I partitioned on the field I was interested on and ranked on date, and chose those with rank 1, however, this doesn't take account for the same value at different time points - e.g. using the wealth example perhaps in Jan 2010 I had $200, in Feb 2010 this went up to 350 but then it fell to $200 again in September 2010, with my ranking idea I'd just have Feb 2010 and 200, not September too.

I thought about some form of 'double join' type thing along hte lines of Field B <> Field B, I've also got pratically no experience of them but for some reason I'm wondering if a CTE is called for?!

Hoepfully I've explained this not too bad and my test data gives you kind of an idea!

Thanks for reading :)
Post #1441357
Posted Thursday, April 11, 2013 12:05 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Yesterday @ 9:06 AM
Points: 4, Visits: 217
You were on the right track with the self join
SELECT t1.ActionDate, t1.Field_B
FROM #test t1
LEFT JOIN #test t2 ON t1.Audit_ID = t2.Audit_ID-1
WHERE t1.Field_B != t2.Field_B OR t2.Audit_ID IS NULL

Post #1441408
Posted Thursday, April 11, 2013 12:26 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Yesterday @ 9:06 AM
Points: 4, Visits: 217
Forgot the entity chunk. Yes you will need a CTE and a Row_Number function.

DECLARE @EntityID INT = 123;

WITH cte
AS (
SELECT row_Number() OVER (ORDER BY Audit_ID) AS RN
,t.Field_B
,t.ActionDate
FROM #test t
WHERE @EntityID = t.EntID
)
SELECT t1.ActionDate
,t1.Field_B
FROM cte t1
LEFT JOIN cte t2 ON t1.RN = t2.RN - 1
WHERE t1.Field_B != t2.Field_B
OR t2.RN IS NULL

Post #1441423
Posted Thursday, April 11, 2013 1:18 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Yesterday @ 5:09 PM
Points: 23,089, Visits: 31,635
Here is another option:


SELECT
Audit_ID,
EntID,
Field_A,
Field_B,
ActionDate,
rn3 = ROW_NUMBER() OVER (PARTITION BY EntID ORDER BY ActionDate) - ROW_NUMBER() OVER (PARTITION BY EntID, Field_B ORDER BY ActionDate)
INTO #test
FROM
(
SELECT 1 as Audit_ID, 123 as EntID, 100 as Field_A, 10 as Field_B, CAST('01-Jan-12' AS DATETIME) as ActionDate UNION
SELECT 2 as Audit_ID, 123 as EntID, 110 as Field_A, 10 as Field_B, CAST('15-Feb-12' AS DATETIME) as ActionDate UNION
SELECT 3 as Audit_ID, 123 as EntID, 150 as Field_A, 15 as Field_B, CAST('20-Mar-12' AS DATETIME) as ActionDate UNION
SELECT 4 as Audit_ID, 123 as EntID, 100 as Field_A, 10 as Field_B, CAST('12-Jun-12' AS DATETIME) as ActionDate UNION
SELECT 5 as Audit_ID, 445 as EntID, 50 as Field_A, 4 as Field_B, CAST('02-Apr-12' AS DATETIME) as ActionDate UNION
SELECT 6 as Audit_ID, 445 as EntID, 60 as Field_A, 3 as Field_B, CAST('15-May-12' AS DATETIME) as ActionDate UNION
SELECT 7 as Audit_ID, 445 as EntID, 30 as Field_A, 3 as Field_B, CAST('18-Aug-12' AS DATETIME) as ActionDate UNION
SELECT 8 as Audit_ID, 552 as EntID, 500 as Field_A, 1 as Field_B, CAST('10-Jan-12' AS DATETIME) as ActionDate UNION
SELECT 9 as Audit_ID, 552 as EntID, 450 as Field_A, 3 as Field_B, CAST('15-Mar-12' AS DATETIME) as ActionDate UNION
SELECT 10 as Audit_ID, 552 as EntID, 320 as Field_A, 4 as Field_B, CAST('29-Aug-12' AS DATETIME) as ActionDate
)x
GO

WITH BaseData AS (
SELECT
Audit_ID,
EntID,
Field_A,
Field_B,
ActionDate,
rn3 = ROW_NUMBER() OVER (PARTITION BY EntID ORDER BY ActionDate) - ROW_NUMBER() OVER (PARTITION BY EntID, Field_B ORDER BY ActionDate)
FROM
#test
), IntermediateData AS (
SELECT
*,
rn = ROW_NUMBER() OVER (PARTITION BY EntID, rn3 ORDER BY ActionDate)
FROM
BaseData
)
SELECT
*
FROM
IntermediateData
WHERE
rn = 1
ORDER BY
EntID,
ActionDate;
GO

DROP TABLE #test;
GO





Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1441454
Posted Friday, April 12, 2013 4:16 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, March 31, 2014 9:19 AM
Points: 314, Visits: 677
Thanks for the responses guys, Lynn, I went with your method in the end, it's exactly what I was looking for!

I'm glad I was on the right track knowing some form of ranking and or self joins and CTEs were called for, I just couldn't quite get it down. When I get some time (who know's when that'll be!) i really need to get stuck into some CTE testing to see what they can do.

Thanks again :)
Post #1441627
Posted Friday, April 12, 2013 6:35 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 8:56 AM
Points: 7,137, Visits: 13,527
Here's an alternative with no CTE or ranking function:
SELECT 
s.*,
y.ReasonToKeep
FROM Sampledata s -- CI scan
OUTER APPLY ( -- CI seek possible
SELECT TOP 1 *
FROM Sampledata i
WHERE i.EntID = s.EntID
AND i.ActionDate < s.ActionDate
ORDER BY i.EntID, i.ActionDate DESC) x
CROSS APPLY ( -- compute scalar
SELECT
ReasonToKeep = CASE
WHEN x.Field_B IS NULL THEN 'FIRST'
WHEN x.Field_B <> s.Field_B THEN 'CHANGED'
END
) y
WHERE y.ReasonToKeep IS NOT NULL



“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1441669
Posted Friday, April 12, 2013 7:59 AM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Yesterday @ 5:09 PM
Points: 23,089, Visits: 31,635
The CTE and NoCTE versions are comparable on a small data set. I'd like to try them both on a million row data set and see how they compare.

Perhaps this evening I will have a chance to set it up.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1441715
Posted Friday, April 12, 2013 8:02 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 8:56 AM
Points: 7,137, Visits: 13,527
Lynn Pettis (4/12/2013)
The CTE and NoCTE versions are comparable on a small data set. I'd like to try them both on a million row data set and see how they compare.

Perhaps this evening I will have a chance to set it up.


Heh if it's a race Lynn, I reserve the right to change shoes!!


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1441720
Posted Friday, April 12, 2013 8:04 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, March 31, 2014 9:19 AM
Points: 314, Visits: 677
I'd like to have time to try the above non CTE approach, time doesn't allow at the moment but hopefully at some point.

FYI the #test replica on my 'real' data yields about 600,000 rows so it's not an insignificant data set, however, it's probably quite a rare request (well, I've not had to dig into these tables for this info ever and we've had them on the go since 2008!)
Post #1441721
Posted Friday, April 12, 2013 8:11 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 8:56 AM
Points: 7,137, Visits: 13,527
Rob-350472 (4/12/2013)
I'd like to have time to try the above non CTE approach, time doesn't allow at the moment but hopefully at some point.

FYI the #test replica on my 'real' data yields about 600,000 rows so it's not an insignificant data set, however, it's probably quite a rare request (well, I've not had to dig into these tables for this info ever and we've had them on the go since 2008!)


I know from recent experiments with similar Gaps'n'Islands solutions that the query I posted will work well providing there's an index to support seeks where shown - preferably unique. It performs best when the number of rows returned is significantly less than the total number of rows in the table and works well over many millions of rows.


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1441725
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse