SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Obtaining earliest change for one colum?


Obtaining earliest change for one colum?

Author
Message
Rob-350472
Rob-350472
SSChasing Mays
SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)

Group: General Forum Members
Points: 649 Visits: 684
Guys,

I've been looking at this problem on and off for a while and I can't for the life of me find a solution. Basically we have an audit system where by every time someone changes one field in one table it appends the entire row into another table, a slight simplification but you get the idea.

What I want to get is a historical record for one field over time - so say this was a list of individuals and it had their wealth in dollars, their job title, their address, their age etc. It may be that their wealth in dollars didn't change over the course of say 10 rows in the table but other variables did change, I'm trying to build up a picture of their wealth changing. Totally random example but I thought it might help!

With this test data:

SELECT * INTO #test
FROM
(
SELECT 1 as Audit_ID, 123 as EntID, 100 as Field_A, 10 as Field_B, '01-Jan-12' as ActionDate UNION
SELECT 2 as Audit_ID, 123 as EntID, 110 as Field_A, 10 as Field_B, '15-Feb-12' as ActionDate UNION
SELECT 3 as Audit_ID, 123 as EntID, 150 as Field_A, 15 as Field_B, '20-Mar-12' as ActionDate UNION
SELECT 4 as Audit_ID, 123 as EntID, 100 as Field_A, 10 as Field_B, '12-Jun-12' as ActionDate UNION
SELECT 5 as Audit_ID, 445 as EntID, 50 as Field_A, 4 as Field_B, '02-Apr-12' as ActionDate UNION
SELECT 6 as Audit_ID, 445 as EntID, 60 as Field_A, 3 as Field_B, '15-May-12' as ActionDate UNION
SELECT 7 as Audit_ID, 445 as EntID, 30 as Field_A, 3 as Field_B, '18-Aug-12' as ActionDate UNION
SELECT 8 as Audit_ID, 552 as EntID, 500 as Field_A, 1 as Field_B, '10-Jan-12' as ActionDate UNION
SELECT 9 as Audit_ID, 552 as EntID, 450 as Field_A, 3 as Field_B, '15-Mar-12' as ActionDate UNION
SELECT 10 as Audit_ID, 552 as EntID, 320 as Field_A, 4 as Field_B, '29-Aug-12' as ActionDate
)x


Say I was trying to track field B over time for Entity (EntID) 123, I'd want a results set like this:

01-Jan-12 10
20-Mar-12 15
12-Jun-12 10

I thought I could do some kind of ranking whereby I partitioned on the field I was interested on and ranked on date, and chose those with rank 1, however, this doesn't take account for the same value at different time points - e.g. using the wealth example perhaps in Jan 2010 I had $200, in Feb 2010 this went up to 350 but then it fell to $200 again in September 2010, with my ranking idea I'd just have Feb 2010 and 200, not September too.

I thought about some form of 'double join' type thing along hte lines of Field B <> Field B, I've also got pratically no experience of them but for some reason I'm wondering if a CTE is called for?!

Hoepfully I've explained this not too bad and my test data gives you kind of an idea!

Thanks for reading Smile
natevdh
natevdh
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 342
You were on the right track with the self join

SELECT t1.ActionDate, t1.Field_B
FROM #test t1
LEFT JOIN #test t2 ON t1.Audit_ID = t2.Audit_ID-1
WHERE t1.Field_B != t2.Field_B OR t2.Audit_ID IS NULL


natevdh
natevdh
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 342
Forgot the entity chunk. Yes you will need a CTE and a Row_Number function.


DECLARE @EntityID INT = 123;

WITH cte
AS (
SELECT row_Number() OVER (ORDER BY Audit_ID) AS RN
,t.Field_B
,t.ActionDate
FROM #test t
WHERE @EntityID = t.EntID
)
SELECT t1.ActionDate
,t1.Field_B
FROM cte t1
LEFT JOIN cte t2 ON t1.RN = t2.RN - 1
WHERE t1.Field_B != t2.Field_B
OR t2.RN IS NULL


Lynn Pettis
Lynn Pettis
SSC-Forever
SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)

Group: General Forum Members
Points: 40000 Visits: 38564
Here is another option:



SELECT
Audit_ID,
EntID,
Field_A,
Field_B,
ActionDate,
rn3 = ROW_NUMBER() OVER (PARTITION BY EntID ORDER BY ActionDate) - ROW_NUMBER() OVER (PARTITION BY EntID, Field_B ORDER BY ActionDate)
INTO #test
FROM
(
SELECT 1 as Audit_ID, 123 as EntID, 100 as Field_A, 10 as Field_B, CAST('01-Jan-12' AS DATETIME) as ActionDate UNION
SELECT 2 as Audit_ID, 123 as EntID, 110 as Field_A, 10 as Field_B, CAST('15-Feb-12' AS DATETIME) as ActionDate UNION
SELECT 3 as Audit_ID, 123 as EntID, 150 as Field_A, 15 as Field_B, CAST('20-Mar-12' AS DATETIME) as ActionDate UNION
SELECT 4 as Audit_ID, 123 as EntID, 100 as Field_A, 10 as Field_B, CAST('12-Jun-12' AS DATETIME) as ActionDate UNION
SELECT 5 as Audit_ID, 445 as EntID, 50 as Field_A, 4 as Field_B, CAST('02-Apr-12' AS DATETIME) as ActionDate UNION
SELECT 6 as Audit_ID, 445 as EntID, 60 as Field_A, 3 as Field_B, CAST('15-May-12' AS DATETIME) as ActionDate UNION
SELECT 7 as Audit_ID, 445 as EntID, 30 as Field_A, 3 as Field_B, CAST('18-Aug-12' AS DATETIME) as ActionDate UNION
SELECT 8 as Audit_ID, 552 as EntID, 500 as Field_A, 1 as Field_B, CAST('10-Jan-12' AS DATETIME) as ActionDate UNION
SELECT 9 as Audit_ID, 552 as EntID, 450 as Field_A, 3 as Field_B, CAST('15-Mar-12' AS DATETIME) as ActionDate UNION
SELECT 10 as Audit_ID, 552 as EntID, 320 as Field_A, 4 as Field_B, CAST('29-Aug-12' AS DATETIME) as ActionDate
)x
GO

WITH BaseData AS (
SELECT
Audit_ID,
EntID,
Field_A,
Field_B,
ActionDate,
rn3 = ROW_NUMBER() OVER (PARTITION BY EntID ORDER BY ActionDate) - ROW_NUMBER() OVER (PARTITION BY EntID, Field_B ORDER BY ActionDate)
FROM
#test
), IntermediateData AS (
SELECT
*,
rn = ROW_NUMBER() OVER (PARTITION BY EntID, rn3 ORDER BY ActionDate)
FROM
BaseData
)
SELECT
*
FROM
IntermediateData
WHERE
rn = 1
ORDER BY
EntID,
ActionDate;
GO

DROP TABLE #test;
GO




Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Rob-350472
Rob-350472
SSChasing Mays
SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)

Group: General Forum Members
Points: 649 Visits: 684
Thanks for the responses guys, Lynn, I went with your method in the end, it's exactly what I was looking for!

I'm glad I was on the right track knowing some form of ranking and or self joins and CTEs were called for, I just couldn't quite get it down. When I get some time (who know's when that'll be!) i really need to get stuck into some CTE testing to see what they can do.

Thanks again Smile
ChrisM@Work
ChrisM@Work
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16484 Visits: 19557
Here's an alternative with no CTE or ranking function:
SELECT 
s.*,
y.ReasonToKeep
FROM Sampledata s -- CI scan
OUTER APPLY ( -- CI seek possible
SELECT TOP 1 *
FROM Sampledata i
WHERE i.EntID = s.EntID
AND i.ActionDate < s.ActionDate
ORDER BY i.EntID, i.ActionDate DESC) x
CROSS APPLY ( -- compute scalar
SELECT
ReasonToKeep = CASE
WHEN x.Field_B IS NULL THEN 'FIRST'
WHEN x.Field_B <> s.Field_B THEN 'CHANGED'
END
) y
WHERE y.ReasonToKeep IS NOT NULL



“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Lynn Pettis
Lynn Pettis
SSC-Forever
SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)

Group: General Forum Members
Points: 40000 Visits: 38564
The CTE and NoCTE versions are comparable on a small data set. I'd like to try them both on a million row data set and see how they compare.

Perhaps this evening I will have a chance to set it up.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
ChrisM@Work
ChrisM@Work
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16484 Visits: 19557
Lynn Pettis (4/12/2013)
The CTE and NoCTE versions are comparable on a small data set. I'd like to try them both on a million row data set and see how they compare.

Perhaps this evening I will have a chance to set it up.


Heh if it's a race Lynn, I reserve the right to change shoes!!

“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Rob-350472
Rob-350472
SSChasing Mays
SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)SSChasing Mays (649 reputation)

Group: General Forum Members
Points: 649 Visits: 684
I'd like to have time to try the above non CTE approach, time doesn't allow at the moment but hopefully at some point.

FYI the #test replica on my 'real' data yields about 600,000 rows so it's not an insignificant data set, however, it's probably quite a rare request (well, I've not had to dig into these tables for this info ever and we've had them on the go since 2008!)
ChrisM@Work
ChrisM@Work
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16484 Visits: 19557
Rob-350472 (4/12/2013)
I'd like to have time to try the above non CTE approach, time doesn't allow at the moment but hopefully at some point.

FYI the #test replica on my 'real' data yields about 600,000 rows so it's not an insignificant data set, however, it's probably quite a rare request (well, I've not had to dig into these tables for this info ever and we've had them on the go since 2008!)


I know from recent experiments with similar Gaps'n'Islands solutions that the query I posted will work well providing there's an index to support seeks where shown - preferably unique. It performs best when the number of rows returned is significantly less than the total number of rows in the table and works well over many millions of rows.

“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search