SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Find Changed Rows (and the row prior to the changes)


Find Changed Rows (and the row prior to the changes)

Author
Message
michael.minarzick 97083
michael.minarzick 97083
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 40
Thank you for reading. I have a table (represented by #Events) that holds modifications made to another table. I do have some control over the table structure and indexing. I want to pull all of the change records that were made between two dates.

The tricky part is to include the previous version of each record, which will usually be found prior to the start date in question.

The code that I have provided below works. So you can use it to easily see what should be returned. But it's very slow in production.

Can you think of a better method to pull this data together?

Thank you in advance.



-- Production version of this table has 4.5 million rows (roughly 1,000 rows per day)
-- Primary key is on L4Ident (clustered)
-- nonclustered index on ProcessDate, LinkRL4
DROP TABLE dbo.#Events;
DROP TABLE dbo.#Results;

CREATE TABLE dbo.#Events (
L4Ident int IDENTITY(1,1) NOT NULL,
LinkRL4 int NOT NULL,
ProcessDate date NOT NULL,
UserName char(10) NOT NULL,
SequenceNumber int NOT NULL,
JobDescription char(10) NOT NULL,
EntryType char(7) NOT NULL,
ChangingField char(10) NOT NULL);

CREATE TABLE dbo.#Results (
ResultsID int IDENTITY(1,1) NOT NULL,
L4Ident int NOT NULL,
LinkRL4 int NOT NULL,
ProcessDate date NOT NULL,
UserName char(10) NOT NULL,
SequenceNumber int NOT NULL,
JobDescription char(10) NOT NULL,
EntryType char(7) NOT NULL,
ChangingField char(10) NOT NULL,
RowNumber smallint NOT NULL);

INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (111, '2014-06-01', 'Michael', 1, 'Daily', 'INSERT', 'GM'),
(111, '2014-06-09', 'Michael', 12, 'Daily', 'UPDATE', 'GMC'),
(111, '2014-06-15', 'Rachel', 19, 'Daily', 'UPDATE', 'GMotors'),
(111, '2014-07-01', 'Rachel', 24, 'Daily', 'UPDATE', 'Motors'),
(111, '2014-07-05', 'Rachel', 39, 'Daily', 'UPDATE', 'GenMotors');

INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (112, '2014-06-04', 'Thomas', 6, 'Daily', 'INSERT', 'MB'),
(112, '2014-06-07', 'Thomas', 10, 'Daily', 'UPDATE', 'Mercedes'),
(112, '2014-06-15', 'Thomas', 20, 'Daily', 'UPDATE', 'MBenz'),
(112, '2014-07-03', 'Michael', 27, 'Daily', 'UPDATE', 'Benz'),
(112, '2014-07-05', 'Rachel', 40, 'Daily', 'UPDATE', 'Mer Benz');

INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (113, '2014-07-06', 'Michael', 55, 'Daily', 'INSERT', 'BMW');




TRUNCATE TABLE dbo.#Results;


DECLARE @StartDate date = '2014-06-12',
@EndDate date = '2014-07-06';

DECLARE @L4Ident int;

DECLARE @CurrentLinkRL4 int,
@MaxLinkRL4 int;

SELECT @CurrentLinkRL4 = MIN(LinkRL4),
@MaxLinkRL4 = MAX(LinkRL4)
FROM dbo.#Events
WHERE ProcessDate BETWEEN @StartDate AND @EndDate;

WHILE ((@CurrentLinkRL4 <= @MaxLinkRL4) AND (@CurrentLinkRL4 IS NOT NULL))
BEGIN

-- insert all records that match within the date range
INSERT dbo.#Results (
L4Ident,
LinkRL4,
ProcessDate,
UserName,
SequenceNumber,
JobDescription,
EntryType,
ChangingField,
RowNumber)
SELECT L4Ident,
LinkRL4,
ProcessDate,
UserName,
SequenceNumber,
JobDescription,
EntryType,
ChangingField,
RowNumber = RANK() OVER (PARTITION BY LinkRL4 ORDER BY L4Ident)
FROM dbo.#Events
WHERE LinkRL4 = @CurrentLinkRL4
AND ProcessDate BETWEEN @StartDate AND @EndDate;

-- find the record immediately prior to the date range
SELECT @L4Ident = MIN(L4Ident)
FROM dbo.#Results
WHERE LinkRL4 = @CurrentLinkRL4;

SELECT @L4Ident = MAX(L4Ident)
FROM dbo.#Events
WHERE LinkRL4 = @CurrentLinkRL4
AND L4Ident < @L4Ident;

-- add one record immediately prior to the date range
INSERT dbo.#Results (
L4Ident,
LinkRL4,
ProcessDate,
UserName,
SequenceNumber,
JobDescription,
EntryType,
ChangingField,
RowNumber)
SELECT L4Ident,
LinkRL4,
ProcessDate,
UserName,
SequenceNumber,
JobDescription,
EntryType,
ChangingField,
RowNumber = 0
FROM dbo.#Events
WHERE L4Ident = @L4Ident;

-- increment loop counter
SELECT @CurrentLinkRL4 = MIN(LinkRL4)
FROM dbo.#Events
WHERE ProcessDate BETWEEN @StartDate AND @EndDate
AND LinkRL4 > @CurrentLinkRL4;
END


SELECT *
FROM dbo.#Results
ORDER BY LinkRL4, ProcessDate, RowNumber
Peter Brinkhaus
Peter Brinkhaus
SSCrazy
SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)

Group: General Forum Members
Points: 2048 Visits: 7369
You could use a single query like below. I have not tested it against a 4.5 million row input set, but it's probably faster then the loop and all the separate queries you are using right now. I recommend to include the column LinkRL4 to the non-clustered index on ProcessDate. If the zero-offset row numbering in case of a previous row outside the data range is not important, remove the CASE expression.


SELECT
evt.*,
RowNumber = ROW_NUMBER() OVER (PARTITION BY evt.LinkRL4 ORDER BY evt.L4Ident) -
CASE WHEN MIN(evt.ProcessDate) OVER (PARTITION BY evt.LinkRL4) < @StartDate THEN 1 ELSE 0 END
FROM
dbo.#Events evt
WHERE
evt.ProcessDate <= @EndDate
AND NOT EXISTS
(
SELECT
*
FROM
dbo.#Events evt2
WHERE
evt2.LinkRL4 = evt.LinkRL4
AND evt2.ProcessDate > evt.ProcessDate
AND evt2.ProcessDate < @StartDate
)
ORDER BY
evt.LinkRL4, evt.ProcessDate, evt.L4Ident


Mike Good
Mike Good
SSC Eights!
SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)

Group: General Forum Members
Points: 800 Visits: 1033
Peter, I think you're on right track, but there is a flaw. The initial sample data does not show it, but when used with date range of 6/12 - 7/01 this data does:

INSERT   dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (111, '2014-06-01', 'Michael', 1, 'Daily', 'INSERT', 'GM'),
(111, '2014-06-09', 'Michael', 12, 'Daily', 'UPDATE', 'GMC'),
(111, '2014-06-15', 'Rachel', 19, 'Daily', 'UPDATE', 'GMotors'),
(111, '2014-07-01', 'Rachel', 24, 'Daily', 'UPDATE', 'Motors'),
(111, '2014-07-05', 'Rachel', 39, 'Daily', 'UPDATE', 'GenMotors');

INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (112, '2014-06-04', 'Thomas', 6, 'Daily', 'INSERT', 'MB'),
(112, '2014-06-07', 'Thomas', 10, 'Daily', 'UPDATE', 'Mercedes'),
(112, '2014-06-15', 'Thomas', 20, 'Daily', 'UPDATE', 'MBenz'),
(112, '2014-07-03', 'Michael', 27, 'Daily', 'UPDATE', 'Benz'),
(112, '2014-07-05', 'Rachel', 40, 'Daily', 'UPDATE', 'Mer Benz');

INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (113, '2014-07-06', 'Michael', 55, 'Daily', 'INSERT', 'BMW');

INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (114, '2014-06-01', 'Steve', 4, 'Daily', 'INSERT', 'VW'),
(114, '2014-06-02', 'Steve', 60, 'Daily', 'UPDATE', 'Volks');

INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (115, '2014-06-01', 'Stan', 2, 'Daily', 'INSERT', 'BMW'),
(115, '2014-06-02', 'Stan', 3, 'Daily', 'UPDATE', 'Beamer'),
(115, '2014-08-02', 'Stan', 80, 'Daily', 'UPDATE', 'BMW');

INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (116, '2014-06-25', 'Lee', 32, 'Daily', 'INSERT', 'BMW'),
(117, '2014-08-02', 'Lee', 81, 'Daily', 'UPDATE', 'BMW');
GO





Mike Good
Mike Good
SSC Eights!
SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)SSC Eights! (800 reputation)

Group: General Forum Members
Points: 800 Visits: 1033
I think this works:

--first part gets events within the date range
SELECT evt.*
FROM dbo.#Events evt
WHERE evt.ProcessDate BETWEEN @StartDate AND @EndDate

--second part gets related events that occurred before
UNION ALL
SELECT Prior.*
FROM (SELECT LinkRL4, MIN(ProcessDate) AS ProcessDate FROM dbo.#Events evt WHERE ProcessDate BETWEEN @StartDate AND @EndDate GROUP BY LinkRL4) AS MinDate
CROSS APPLY (
SELECT TOP 1 evt.*
FROM dbo.#Events evt
WHERE evt.LinkRL4 = MinDate.LinkRL4
AND evt.ProcessDate < MinDate.ProcessDate
ORDER BY evt.ProcessDate DESC) as Prior
ORDER BY LinkRL4, ProcessDate, L4Ident





ScottPletcher
ScottPletcher
SSCertifiable
SSCertifiable (8K reputation)SSCertifiable (8K reputation)SSCertifiable (8K reputation)SSCertifiable (8K reputation)SSCertifiable (8K reputation)SSCertifiable (8K reputation)SSCertifiable (8K reputation)SSCertifiable (8K reputation)

Group: General Forum Members
Points: 7985 Visits: 7160
I'd also create the best indexes while re-working the code.

#Events indexes:
1) Unique Clustered on ( ProcessDate, L4Ident ) --in that order
2) Nonclustered on ( L4Ident )
3) NNonclustered on ( LinkRL4, L4Ident ) Include ( ProcessDate ) --yes, processdate would be auto-included anyway, but I like to explicitly specify it anyway, in case the clus index changes later.

SQL DBA,SQL Server MVP(07, 08, 09)[size=2]Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them.[/size]
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search