Get one type with max id

  • I need to get the max address id for each client in this order:

    Billing

    Mailing

    Home

    [Null]

    So if they have a Billing address type, I need the maxid for that address. If they don't have a Billing, I need the Mailing address, and so on. But only one address id per client.

    Mock table data:

    USE [tempdb]

    GO

    CREATE TABLE [dbo].[addresses](

    [AddID] [int] NOT NULL PRIMARY KEY CLUSTERED,

    [ClientId] [int] NOT NULL,

    [AddrType] [nvarchar](10) NOT NULL

    )

    GO

    INSERT INTO addresses VALUES (1001, 9999, 'Mailing')

    INSERT INTO addresses VALUES (1002, 9999, 'Billing')

    INSERT INTO addresses VALUES (1003, 9999, 'Work')

    INSERT INTO addresses VALUES (1004, 9999, 'Billing')

    INSERT INTO addresses VALUES (1005, 9998, 'Mailing')

    INSERT INTO addresses VALUES (1006, 9998, 'Home')

    INSERT INTO addresses VALUES (1007, 9997, '')

    This is what I want to see:

    +-------+----------+----------+

    | AddID | ClientID | AddrType |

    +-------+----------+----------+

    | 1004 | 9999 | Billing |

    | 1005 | 9998 | Mailing |

    | 1007 | 9997 | |

    +-------+----------+----------+

    I've tried many things except/intersect/union, subqueries, etc. Here's a recent work-in-progress:

    select

    *

    from

    (select

    max(AddID) as 'MaxAddID',

    ClientId,

    AddrType

    from

    addresses

    group by

    ClientId,

    AddrType) Keys

    left join addresses addresses on Keys.MaxAddID = addresses.AddID

    where

    addresses.AddrType = 'Billing'

    order by

    addresses.ClientId

    Help, please. Thanks!

  • Here is my solution:

    with basedata as (

    select

    rn = row_number() over (partition by ClientId order by case when AddrType = 'Billing' then 1

    when AddrType = 'Mailing' then 2

    when AddrType = 'Home' then 3

    else 4

    end, AddID desc),

    AddID,

    ClientId,

    AddrType

    from

    dbo.addresses

    )

    select

    AddID,

    ClientId,

    AddrType

    from

    basedata

    where

    rn = 1

    order by

    ClientId desc;

    Look it over and ask any questions you need to understand what I did.

  • Works great! Thank you very much!

    One of my many iterations was along those lines but not nearly as sophisticated. My boss thought it should be done with a cursor, but I knew there had to be a better way.

    Thanks again!

  • Glad it worked.

  • Just noticed that you have an entry for Work but tat wasn't included in you priority sort order in the original post. Where do you want that to be in the sort order?

  • The problem with Lynn's fine solution is that it isn't able to utilise an index on the Address Type, hence a relatively expensive sort operator is introduced in the execution plan. With a little bit of creativity we can eliminate the sort and lower the execution cost about 3/4.

    😎

    USE TEEST;

    GO

    SET NOCOUNT ON;

    IF OBJECT_ID(N'dbo.addresses') IS NOT NULL DROP TABLE dbo.addresses;

    CREATE TABLE [dbo].[addresses](

    [AddID] [int] NOT NULL PRIMARY KEY CLUSTERED,

    [ClientId] [int] NOT NULL,

    [AddrType] [nvarchar](10) NOT NULL

    )

    INSERT INTO addresses VALUES (1001, 9999, 'Mailing')

    ,(1002, 9999, 'Billing')

    ,(1003, 9999, 'Work')

    ,(1004, 9999, 'Billing')

    ,(1005, 9998, 'Mailing')

    ,(1006, 9998, 'Home')

    ,(1007, 9997, '');

    CREATE NONCLUSTERED INDEX NCLIDX_DBO_ADDRESSES_CLIENT_ID_ADDRTYPE_INCL_ADDID ON dbo.addresses (ClientId ASC, AddrType ASC) INCLUDE (AddID)

    ;WITH BASE_DATA AS

    (

    SELECT

    AD.AddID

    ,AD.ClientId

    ,AD.AddrType

    ,CASE

    WHEN AD.AddrType = 'Billing' THEN 1

    WHEN AD.AddrType = 'Mailing' THEN 2

    WHEN AD.AddrType = 'Work' THEN 3

    WHEN AD.AddrType = 'Home' THEN 4

    ELSE 5

    END AS TRNK

    FROM dbo.addresses AD

    )

    ,RANKED_DATA AS

    (

    SELECT

    BD.AddID

    ,BD.ClientId

    ,BD.AddrType

    ,BD.TRNK

    ,MIN(BD.TRNK) OVER

    (

    PARTITION BY BD.ClientId

    ) AS MRNK

    FROM BASE_DATA BD

    )

    ,FINAL_RESULTS AS

    (

    SELECT

    RD.AddID

    ,RD.ClientId

    ,RD.AddrType

    ,ROW_NUMBER() OVER

    (

    PARTITION BY RD.ClientId

    ORDER BY (SELECT NULL)

    ) AS TRID

    FROM RANKED_DATA RD

    WHERE RD.TRNK = RD.MRNK

    )

    SELECT

    FR.AddID

    ,FR.ClientId

    ,FR.AddrType

    FROM FINAL_RESULTS FR

    WHERE FR.TRID = 1;

    IO and Time statistics

    --WITHOUT SORT

    Table 'Worktable'. Scan count 3, logical reads 27, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'addresses'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 32 ms, elapsed time = 102 ms.

    --WITH SORT

    Table 'addresses'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 376 ms.

  • Nice! Thanks!

  • Eirikur Eiriksson (8/4/2016)


    The problem with Lynn's fine solution is that it isn't able to utilise an index on the Address Type, hence a relatively expensive sort operator is introduced in the execution plan. With a little bit of creativity we can eliminate the sort and lower the execution cost about 3/4.

    😎

    USE TEEST;

    GO

    SET NOCOUNT ON;

    IF OBJECT_ID(N'dbo.addresses') IS NOT NULL DROP TABLE dbo.addresses;

    CREATE TABLE [dbo].[addresses](

    [AddID] [int] NOT NULL PRIMARY KEY CLUSTERED,

    [ClientId] [int] NOT NULL,

    [AddrType] [nvarchar](10) NOT NULL

    )

    INSERT INTO addresses VALUES (1001, 9999, 'Mailing')

    ,(1002, 9999, 'Billing')

    ,(1003, 9999, 'Work')

    ,(1004, 9999, 'Billing')

    ,(1005, 9998, 'Mailing')

    ,(1006, 9998, 'Home')

    ,(1007, 9997, '');

    CREATE NONCLUSTERED INDEX NCLIDX_DBO_ADDRESSES_CLIENT_ID_ADDRTYPE_INCL_ADDID ON dbo.addresses (ClientId ASC, AddrType ASC) INCLUDE (AddID)

    ;WITH BASE_DATA AS

    (

    SELECT

    AD.AddID

    ,AD.ClientId

    ,AD.AddrType

    ,CASE

    WHEN AD.AddrType = 'Billing' THEN 1

    WHEN AD.AddrType = 'Mailing' THEN 2

    WHEN AD.AddrType = 'Work' THEN 3

    WHEN AD.AddrType = 'Home' THEN 4

    ELSE 5

    END AS TRNK

    FROM dbo.addresses AD

    )

    ,RANKED_DATA AS

    (

    SELECT

    BD.AddID

    ,BD.ClientId

    ,BD.AddrType

    ,BD.TRNK

    ,MIN(BD.TRNK) OVER

    (

    PARTITION BY BD.ClientId

    ) AS MRNK

    FROM BASE_DATA BD

    )

    ,FINAL_RESULTS AS

    (

    SELECT

    RD.AddID

    ,RD.ClientId

    ,RD.AddrType

    ,ROW_NUMBER() OVER

    (

    PARTITION BY RD.ClientId

    ORDER BY (SELECT NULL)

    ) AS TRID

    FROM RANKED_DATA RD

    WHERE RD.TRNK = RD.MRNK

    )

    SELECT

    FR.AddID

    ,FR.ClientId

    ,FR.AddrType

    FROM FINAL_RESULTS FR

    WHERE FR.TRID = 1;

    Hrm. So I ran both sets of code and I see Eirikur's code is getting a different answer than Lynn's code is.

    Lynn is getting:

    AddIDClientIdAddrType

    10049999Billing

    10059998Mailing

    10079997

    And Eirikur is getting:

    AddIDClientIdAddrType

    10079997

    10059998Mailing

    10029999Billing

    So the issue here is which Billing address does the OP need? The most recent one (i.e., highest AddID) or the oldest or just any old Billing address?

    In my office, we'd want the highest ID as the most recent. So I would change the ORDER BY in the last CTE of Eirikur's code to be as below:

    WITH BASE_DATA AS

    (

    SELECT

    AD.AddID

    ,AD.ClientId

    ,AD.AddrType

    ,CASE

    WHEN AD.AddrType = 'Billing' THEN 1

    WHEN AD.AddrType = 'Mailing' THEN 2

    WHEN AD.AddrType = 'Work' THEN 3

    WHEN AD.AddrType = 'Home' THEN 4

    ELSE 5

    END AS TRNK

    FROM dbo.addresses AD

    )

    ,RANKED_DATA AS

    (

    SELECT

    BD.AddID

    ,BD.ClientId

    ,BD.AddrType

    ,BD.TRNK

    ,MIN(BD.TRNK) OVER

    (

    PARTITION BY BD.ClientId

    ) AS MRNK

    FROM BASE_DATA BD

    )

    ,FINAL_RESULTS AS

    (

    SELECT

    RD.AddID

    ,RD.ClientId

    ,RD.AddrType

    ,ROW_NUMBER() OVER

    (

    PARTITION BY RD.ClientId

    ORDER BY (SELECT RD.AddID) DESC

    ) AS TRID

    FROM RANKED_DATA RD

    WHERE RD.TRNK = RD.MRNK

    )

    SELECT

    FR.AddID

    ,FR.ClientId

    ,FR.AddrType

    FROM FINAL_RESULTS FR

    WHERE FR.TRID = 1;

    EDIT: fixed messed up quote tags.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Nice input Brandy, appreciate it:-) Maybe the OP can enlighten us on the issue?

    😎

  • Eirikur Eiriksson (8/4/2016)


    Nice input Brandy, appreciate it:-) Maybe the OP can enlighten us on the issue?

    😎

    There's also the issue of the final sort order. Lynn's results are sorted differently than yours so I don't know how that would affect the stats you pulled when you put both fixes in.

    SELECT

    FR.AddID

    ,FR.ClientId

    ,FR.AddrType

    FROM FINAL_RESULTS FR

    WHERE FR.TRID = 1

    ORDER BY FR.ClientId

    But the question is whether or not the OP needs the results sorted the way provided in the original post.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Actually, I think my solution is quite viable. I created a 1,000,000 row test to compare the two solutions. I also noticed a difference between them where Eirikur's solution returned the minimum value for AddID and mine the maximum.

    if object_id('dbo.eTally') is not null

    DROP FUNCTION [dbo].[eTally]

    GO

    /****** Object: UserDefinedFunction [dbo].[eTally] Script Date: 8/4/2016 9:32:22 AM ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    create function [dbo].[eTally] (@Range int, @StartZero bit)

    returns table

    with schemabinding

    as return(

    with e1(n) as (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),

    e2(n) as (select 1 from e1 a cross join e1 b),

    e4(n) as (select 1 from e2 a cross join e2 b),

    etally(n) as (select n = row_number() over (order by (select null)) from e4 a cross join e4 b)

    select top (@Range)

    n - cast(@StartZero as int) as n

    from

    etally);

    GO

    SET NOCOUNT ON;

    IF OBJECT_ID(N'dbo.addresses') IS NOT NULL DROP TABLE dbo.addresses;

    CREATE TABLE [dbo].[addresses](

    [AddID] [int] NOT NULL PRIMARY KEY CLUSTERED,

    [ClientId] [int] NOT NULL,

    [AddrType] [nvarchar](10) NOT NULL

    )

    --INSERT INTO addresses VALUES (1001, 9999, 'Mailing')

    -- ,(1002, 9999, 'Billing')

    -- ,(1003, 9999, 'Work')

    -- ,(1004, 9999, 'Billing')

    -- ,(1005, 9998, 'Mailing')

    -- ,(1006, 9998, 'Home')

    -- ,(1007, 9997, '');

    --===== Declare some obviously named variables

    DECLARE @NumberOfRows INT,

    @StartValue INT,

    @EndValue INT,

    @Range INT

    ;

    --===== Preset the variables to known values

    SELECT @NumberOfRows = 1000000,

    @StartValue = 1000,

    @EndValue = 9999,

    @Range = @EndValue - @StartValue + 1

    ;

    --===== Conditionally drop the test table to make reruns easier in SSMS

    IF OBJECT_ID('tempdb..#SomeTestTable','U') IS NOT NULL

    DROP TABLE #SomeTestTable

    ;

    --===== Create the test table with "random constrained" integers and floats

    -- within the parameters identified in the variables above.

    INSERT INTO addresses

    SELECT TOP (@NumberOfRows)

    AddId = n + 1000,

    SomeRandomInteger = ABS(CHECKSUM(NEWID())) % @Range + @StartValue,

    SomeRandomValue = case ABS(CHECKSUM(NEWID())) % 5 + 1 when 1 then 'Billing' when 2 then 'Mailing' when 3 then 'Work' when 4 then 'Home' else '' end

    -- INTO #SomeTestTable

    FROM dbo.eTally(@NumberOfRows,0);

    CREATE NONCLUSTERED INDEX NCLIDX_DBO_ADDRESSES_CLIENT_ID_ADDRTYPE_INCL_ADDID ON dbo.addresses (ClientId ASC, AddrType ASC) INCLUDE (AddID);

    raiserror('Eirikur Eiriksson Solution Starting',10,1) with nowait;

    set statistics io,time on;

    WITH BASE_DATA AS

    (

    SELECT

    AD.AddID

    ,AD.ClientId

    ,AD.AddrType

    ,CASE

    WHEN AD.AddrType = 'Billing' THEN 1

    WHEN AD.AddrType = 'Mailing' THEN 2

    WHEN AD.AddrType = 'Work' THEN 3

    WHEN AD.AddrType = 'Home' THEN 4

    ELSE 5

    END AS TRNK

    FROM dbo.addresses AD

    )

    ,RANKED_DATA AS

    (

    SELECT

    BD.AddID

    ,BD.ClientId

    ,BD.AddrType

    ,BD.TRNK

    ,MIN(BD.TRNK) OVER

    (

    PARTITION BY BD.ClientId

    ) AS MRNK

    FROM BASE_DATA BD

    )

    ,FINAL_RESULTS AS

    (

    SELECT

    RD.AddID

    ,RD.ClientId

    ,RD.AddrType

    ,ROW_NUMBER() OVER

    (

    PARTITION BY RD.ClientId

    ORDER BY (SELECT NULL)

    ) AS TRID

    FROM RANKED_DATA RD

    WHERE RD.TRNK = RD.MRNK

    )

    SELECT

    FR.AddID

    ,FR.ClientId

    ,FR.AddrType

    FROM FINAL_RESULTS FR

    WHERE FR.TRID = 1;

    set statistics io,time off;

    raiserror('Eirikur Eiriksson Solution Ending',10,1) with nowait;

    raiserror('Lynn Pettis Solution Starting',10,1) with nowait;

    set statistics io,time on;

    with basedata as (

    select

    rn = row_number() over (partition by ClientId order by case when AddrType = 'Billing' then 1

    when AddrType = 'Mailing' then 2

    when AddrType = 'Work' then 3

    when AddrType = 'Home' then 4

    else 5

    end, AddID desc),

    AddID,

    ClientId,

    AddrType

    from

    dbo.addresses

    )

    select

    AddID,

    ClientId,

    AddrType

    from

    basedata

    where

    rn = 1

    order by

    ClientId;

    set statistics io,time off;

    raiserror('Lynn Pettis Solution Ending',10,1) with nowait;

    select

    *

    from

    dbo.addresses

    order by

    ClientId asc,

    case when AddrType = 'Billing' then 1

    when AddrType = 'Mailing' then 2

    when AddrType = 'Work' then 3

    when AddrType = 'Home' then 4

    else 5

    end,

    AddID desc;

    This is the output of the statistics io and time when I ran it on my laptop.

    Eirikur Eiriksson Solution Starting

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 1 ms.

    Table 'Worktable'. Scan count 3, logical reads 2036001, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'addresses'. Scan count 1, logical reads 2898, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1700 ms, elapsed time = 1708 ms.

    Eirikur Eiriksson Solution Ending

    Lynn Pettis Solution Starting

    SQL Server parse and compile time:

    CPU time = 1 ms, elapsed time = 1 ms.

    Table 'addresses'. Scan count 9, logical reads 2956, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1608 ms, elapsed time = 333 ms.

    Lynn Pettis Solution Ending

    Also, if you pull the execution plans you will see that both queries use the nonclustered index.

  • The "Work" option is not to be included in the data. They only want Billing, Mailing, Home, blank (because address type isn't required).

  • Sorry, just saw the posts with questions about the sort order.

    We DO want the most recent of a given address type, i.e., the highest AddrID, which is what Lynn's code returns. I pulled the code apart (because I haven't used partitions that much and wanted to understand it better) and tested it pretty thoroughly.

    And in another response, I mentioned that "Work" is not to be included in the returned data, though a blank address type is. I updated the code to exclude "Work" types.

    So in this scenario (added some data to orignal):

    IF OBJECT_ID(N'dbo.addresses') IS NOT NULL DROP TABLE dbo.addresses;

    CREATE TABLE [dbo].[addresses](

    [AddID] [int] NOT NULL PRIMARY KEY CLUSTERED,

    [ClientId] [int] NOT NULL,

    [AddrType] [nvarchar](10) NOT NULL

    )

    INSERT INTO addresses VALUES (1001, 9999, 'Mailing')

    ,(1002, 9999, 'Billing')

    ,(1003, 9999, 'Work')

    ,(1004, 9999, 'Billing')

    ,(1005, 9998, 'Mailing')

    ,(1006, 9998, 'Home')

    ,(1007, 9997, 'Work')

    ,(1008, 9997, '')

    ,(1009, 9996, 'Work');

    I want records 1004, 1005 and 1007, but not 1009.

  • Reworking my test suite, as it generated all 5 address type for everyone.

  • I added a where clause to filter out 'Work' and a case for ''. Seems to be working just fine.

    USE [tempdb]

    GO

    IF OBJECT_ID(N'dbo.addresses') IS NOT NULL DROP TABLE dbo.addresses;

    CREATE TABLE [dbo].[addresses](

    [AddID] [int] NOT NULL PRIMARY KEY CLUSTERED,

    [ClientId] [int] NOT NULL,

    [AddrType] [nvarchar](10) NOT NULL

    )

    GO

    INSERT INTO addresses VALUES (1001, 9999, 'Mailing')

    ,(1002, 9999, 'Billing')

    ,(1003, 9999, 'Work')

    ,(1004, 9999, 'Billing')

    ,(1005, 9998, 'Mailing')

    ,(1006, 9998, 'Home')

    ,(1007, 9997, '')

    ,(1008, 9997, 'Work')

    ,(1009, 9996, 'Work')

    go

    with basedata as (

    select

    rn = row_number() over (partition by ClientId order by case when AddrType = 'Billing' then 1

    when AddrType = 'Mailing' then 2

    when AddrType = 'Home' then 3

    when AddrType = ''then 4

    else 999

    end, AddID desc),

    AddID,

    ClientId,

    AddrType

    from

    dbo.addresses

    where

    AddrType <> 'Work'

    )

    select

    AddID,

    ClientId,

    AddrType

    from

    basedata

    where

    rn = 1

    order by

    ClientId desc;

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply