Row Numbering using Partion By

  • We have a table that contains the ownership history of our assets (laptops/desktops). There is a job that runs every night that updates this table if ownership has changed. There was a bit of a developer error :blush: with my last update to this code which inadvertently inserted additional rows for a service tag/employee combination when it should not have. The first thing I want to do is clean up these duplicates in the table. I really thought I would knock this out quickly but have run into a problem. Here is my existing code using a temp table that mimics the history table:

    if OBJECT_ID('tempdb..#AssetHistory') is not null

    drop table #AssetHistory

    go

    set nocount on

    create table #AssetHistory

    (EmployeeID int,

    ServiceTag char(7),

    FirstName varchar(10),

    LastName varchar(20),

    Location varchar(20),

    LastUpdateDate datetime)

    insert into #AssetHistory values (100010151, '10CR4M1', 'Bruce', 'Robinson', 'Tampa', '2011-01-07 19:01:14.000')

    insert into #AssetHistory values (100011036, '10CR4M1', 'Christine', 'Zucker', 'Tampa', '2011-01-05 19:00:52.000')

    insert into #AssetHistory values (920000186, '10CR4M1', 'Duane', 'Roth', 'Tampa', '2010-07-28 19:08:51.000')

    insert into #AssetHistory values (999922222, '10CR4M1', 'Michael', 'Miller', 'Tampa', '2010-07-16 19:06:51.000')

    insert into #AssetHistory values (100096597, '10CW4Q1', 'Donna', 'Smith', 'Remote', '2011-05-12 10:24:45.000')

    insert into #AssetHistory values (100096597, '10CW4Q1', 'Donna', 'Smith', 'Remote', '2011-08-23 10:51:22.000')

    insert into #AssetHistory values (100096597, '10CW4Q1', 'Donna', 'Smith', 'Remote', '2011-09-17 10:52:59.000')

    insert into #AssetHistory values (100096597, '10CW4Q1', 'Donna', 'Smith', 'Remote', '2011-09-18 10:25:00.000')

    insert into #AssetHistory values (100096597, '10CW4Q1', 'Donna', 'Smith', 'Remote', '2011-09-21 13:48:29.000')

    insert into #AssetHistory values (100052969, '10YGVL1', 'Wendie', 'Williams', 'Irvine', '2010-05-24 14:29:36.000')

    insert into #AssetHistory values (100051487, '10YGVL1', 'Jennifer', 'Duane', 'Irvine', '2010-10-22 19:02:18.000')

    insert into #AssetHistory values (100052157, '10YGVL1', 'Lisa', 'Chung', 'Irvine', '2011-01-28 15:40:30.000')

    insert into #AssetHistory values (100052157, '10YGVL1', 'Lisa', 'Chung', 'Irvine', '2011-08-24 16:40:33.000')

    insert into #AssetHistory values (100052157, '10YGVL1', 'Lisa', 'Chung', 'Irvine', '2011-09-14 09:21:48.000')

    insert into #AssetHistory values (100052157, '10YGVL1', 'Lisa', 'Chung', 'Irvine', '2011-09-16 16:40:29.000')

    insert into #AssetHistory values (100014883, '128V1M1', 'Tom', 'Mower', 'Austin', '2010-05-24 14:29:08.000')

    insert into #AssetHistory values (100042636, '128V1M1', 'David', 'King', 'Chicago', '2010-08-26 19:03:24.000')

    insert into #AssetHistory values (100014883, '128V1M1', 'Tom', 'Mower', 'Austin', '2011-08-24 06:46:46.000')

    insert into #AssetHistory values (100014883, '128V1M1', 'Tom', 'Mower', 'Austin', '2011-09-14 11:08:10.000')

    insert into #AssetHistory values (100014883, '128V1M1', 'Tom', 'Mower', 'Austin', '2011-09-16 09:11:34.000')

    insert into #AssetHistory values (100014883, '128V1M1', 'Tom', 'Mower', 'Austin', '2011-09-18 07:19:57.000')

    ; with Dupes as

    (select EmployeeID, ServiceTag, FirstName, LastName, Location, LastUpdateDate,

    RowNo = ROW_NUMBER() over (partition by ServiceTag, EmployeeID order by ServiceTag, LastUpdateDate)

    from #AssetHistory)

    select EmployeeID, ServiceTag, FirstName, LastName, LastUpdateDate, RowNo

    from Dupes

    order by ServiceTag, LastUpdateDate

    The results are:

    EmployeeID ServiceTag FirstName LastName LastUpdateDate RowNo

    ----------- ---------- ---------- -------------------- ----------------------- --------------------

    999922222 10CR4M1 Michael Miller 2010-07-16 19:06:51.000 1

    920000186 10CR4M1 Duane Roth 2010-07-28 19:08:51.000 1

    100011036 10CR4M1 Christine Zucker 2011-01-05 19:00:52.000 1

    100010151 10CR4M1 Bruce Robinson 2011-01-07 19:01:14.000 1

    100096597 10CW4Q1 Donna Smith 2011-05-12 10:24:45.000 1

    100096597 10CW4Q1 Donna Smith 2011-08-23 10:51:22.000 2

    100096597 10CW4Q1 Donna Smith 2011-09-17 10:52:59.000 3

    100096597 10CW4Q1 Donna Smith 2011-09-18 10:25:00.000 4

    100096597 10CW4Q1 Donna Smith 2011-09-21 13:48:29.000 5

    100052969 10YGVL1 Wendie Williams 2010-05-24 14:29:36.000 1

    100051487 10YGVL1 Jennifer Duane 2010-10-22 19:02:18.000 1

    100052157 10YGVL1 Lisa Chung 2011-01-28 15:40:30.000 1

    100052157 10YGVL1 Lisa Chung 2011-08-24 16:40:33.000 2

    100052157 10YGVL1 Lisa Chung 2011-09-14 09:21:48.000 3

    100052157 10YGVL1 Lisa Chung 2011-09-16 16:40:29.000 4

    100014883 128V1M1 Tom Mower 2010-05-24 14:29:08.000 1

    100042636 128V1M1 David King 2010-08-26 19:03:24.000 1

    100014883 128V1M1 Tom Mower 2011-08-24 06:46:46.000 2

    100014883 128V1M1 Tom Mower 2011-09-14 11:08:10.000 3

    100014883 128V1M1 Tom Mower 2011-09-16 09:11:34.000 4

    100014883 128V1M1 Tom Mower 2011-09-18 07:19:57.000 5

    All assets are returning a RowNo I am expecting with the exception of the last asset, 128V1M1. Tom had the asset first and then it went to David and then it came back to Tom. I want that last return of the asset to Tom (dated 2011-08-24 06:46:46.000) to get assigned a RowNo of 1. I've tried variations in my partition but have not been able to achieve the result I want. We definitely need to maintain a chronological order of asset ownership and if an asset jumps from one person to another and then back, we need to retain that. My goal was to delete all records where RowNo <> 1 but if I do that, asset 128V1M1 will incorrectly reflect David as the current owner of the asset when in fact it should be Tom.

    Where am I going wrong?

    Thank you -

    Lisa

  • This should work for you

    ; with Dupes as

    (select DISTINCT AH1.EmployeeID, AH1.ServiceTag, AH1.FirstName, AH1.LastName, AH1.Location, AH1.LastUpdateDate

    ,AH2.EmployeeID AS PreviousOwner

    --,RowNo = ROW_NUMBER() over (partition by AH1.EmployeeID, AH1.ServiceTag,AH2.EmployeeID order by AH1.ServiceTag, AH1.LastUpdateDate Asc)

    from #AssetHistory AH1

    LEFT OUTER JOIN #AssetHistory AH2

    ON AH1.ServiceTag = AH2.ServiceTag

    AND AH1.LastUpdateDate < AH2.LastUpdateDate

    AND AH1.EmployeeID <> AH2.EmployeeID

    ), dedupe AS (

    select DISTINCT d.EmployeeID, d.ServiceTag, d.FirstName, d.LastName, d.Location, d.LastUpdateDate,ISNULL(d.PreviousOwner,d.employeeid) AS PreviousOwner

    ,RowNo = ROW_NUMBER() over (partition by d.EmployeeID, d.ServiceTag,ISNULL(d.PreviousOwner,d.employeeid) order by d.ServiceTag, d.LastUpdateDate,d.PreviousOwner Asc)

    from Dupes d

    )

    select EmployeeID, ServiceTag, FirstName, LastName, LastUpdateDate, RowNo --,PreviousOwner

    from dedupe

    order by ServiceTag, LastUpdateDate

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Your PARTITION BY is wrong. You want the last record for each Service_Tag, so the PARTITION BY should just be on the Service_Tag, and not on the Service_Tag, Employee_ID that you currently have.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Beautiful! Thank you very much Jason!

  • drew.allen (10/7/2011)


    Your PARTITION BY is wrong. You want the last record for each Service_Tag, so the PARTITION BY should just be on the Service_Tag, and not on the Service_Tag, Employee_ID that you currently have.

    Drew

    If just the last record for each service tag is taken then the ownership chain would be broken. It appears they want to be able to document the change of ownership and when that change occurred.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • LSAdvantage (10/7/2011)


    Beautiful! Thank you very much Jason!

    You are welcome. I would recommend making changes to the underlying schema in order to show ownership change or maybe previousowner such as i did. A change such as that could simplify later queries.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Performance may be poor on larger datasets, because the solution is a Triangular Join or "Hidden RBAR". For more information, see the link in SQLRNNR/Jason's signature!

    If you data isn't big enough to cause a problem, then fine, else the "quirky update" method would perform better.

  • I have 36,465 records in my table and 21,776 will be removed. And this is a one time thing, that's for sure. I'm going to fix my insert first so that I don't accumulate more of a mess when the job runs tonight.

  • paul_ramster (10/7/2011)


    Performance may be poor on larger datasets, because the solution is a Triangular Join or "Hidden RBAR". For more information, see the link in SQLRNNR/Jason's signature!

    If you data isn't big enough to cause a problem, then fine, else the "quirky update" method would perform better.

    Yup - that's right, though this would be a really acute triangular join :-D. It would be far worse if using just one or two of the join conditions I threw in there.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply