UDF to Sort Column Values into one Concatenated Key Field

  • Hello -

    New to SQL Server. Have a lot of MS Access experience and know how to do this task in Access vba but I have not been able to figure out how to replicate this task in a SQL server UDF.

    I have a qry that appends new rows to an existing table. Among the columns there are: col1, col2, col3, col4, and col5. These all have the same data type (Long) AND same source of the value (think person ID). The qry can generate many combinations of these 5 columns but in some cases, it is possible that 25 combinations are actually the same 5 values - just in different columns.

    To eliminate duplicate combinations, in vba I have a function that:

    1. puts the 5 col values into an array

    2. sorts the array

    3. then concatenates the ORDERED col values into a created field that will be appended to the Primary Key field on the target table.

    So these values:

    col1=333, col2=555, col3=111, col4=222, and col5=444 (all in the same row)

    (note: another row could have the same values in different columns: col1=222, col2=333, col3=444, col4=555, and col5=111 but would in fact be a duplicate row in the target table - as demonstrated by the resulting value below)

    would be sorted and concatenated into:

    PrimaryKey field = 111222333444555

    Any guidance would be greatly appreciated!!!

  • hreyes819 (9/22/2015)


    Hello -

    New to SQL Server. Have a lot of MS Access experience and know how to do this task in Access vba but I have not been able to figure out how to replicate this task in a SQL server UDF.

    I have a qry that appends new rows to an existing table. Among the columns there are: col1, col2, col3, col4, and col5. These all have the same data type (Long) AND same source of the value (think person ID). The qry can generate many combinations of these 5 columns but in some cases, it is possible that 25 combinations are actually the same 5 values - just in different columns.

    To eliminate duplicate combinations, in vba I have a function that:

    1. puts the 5 col values into an array

    2. sorts the array

    3. then concatenates the ORDERED col values into a created field that will be appended to the Primary Key field on the target table.

    So these values:

    col1=333, col2=555, col3=111, col4=222, and col5=444 (all in the same row)

    (note: another row could have the same values in different columns: col1=222, col2=333, col3=444, col4=555, and col5=111 but would in fact be a duplicate row in the target table - as demonstrated by the resulting value below)

    would be sorted and concatenated into:

    PrimaryKey field = 111222333444555

    Any guidance would be greatly appreciated!!!

    First, I am confused as to how the two rows would be duplicates when the data in each column is actually different from each other. Could you explain in a bit more detail how these rows are actually duplicates, besides by sorting the data in the columns and then concatenating them? That part just does not make any sense to me and doesn't fit any business use cases I can conjure up at the moment.

  • hreyes819 (9/22/2015)


    Hello -

    New to SQL Server. Have a lot of MS Access experience and know how to do this task in Access vba but I have not been able to figure out how to replicate this task in a SQL server UDF.

    I have a qry that appends new rows to an existing table. Among the columns there are: col1, col2, col3, col4, and col5. These all have the same data type (Long) AND same source of the value (think person ID). The qry can generate many combinations of these 5 columns but in some cases, it is possible that 25 combinations are actually the same 5 values - just in different columns.

    To eliminate duplicate combinations, in vba I have a function that:

    1. puts the 5 col values into an array

    2. sorts the array

    3. then concatenates the ORDERED col values into a created field that will be appended to the Primary Key field on the target table.

    So these values:

    col1=333, col2=555, col3=111, col4=222, and col5=444 (all in the same row)

    (note: another row could have the same values in different columns: col1=222, col2=333, col3=444, col4=555, and col5=111 but would in fact be a duplicate row in the target table - as demonstrated by the resulting value below)

    would be sorted and concatenated into:

    PrimaryKey field = 111222333444555

    Any guidance would be greatly appreciated!!!

    This seems incredibly contrived for a primary key value. How would you know that the generated value doesn't already exist in your table? The way to do this in sql server would be to UNPIVOT the columns into rows so you can sort them and the concatenate the rows together again. I will post an example shortly.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Or, given that (I think) you only have five columns, you could bubble sort:

    WITH MyTable (col1, col2, col3, col4, col5) AS (

    SELECT 222, 444, 555, 111, 333

    )

    , FirstSort (col1, col2, col3, col4, col5) AS (

    SELECT

    CASE WHEN col1 > col2 THEN col2 ELSE col1 END

    ,CASE WHEN col1 > col2 THEN col1 ELSE col2 END

    ,CASE WHEN col3 > col4 THEN col4 ELSE col3 END

    ,CASE WHEN col3 > col4 THEN col3 ELSE col4 END

    ,col5

    FROM MyTable

    )

    , SecondSort (col1, col2, col3, col4, col5) AS (

    SELECT

    col1

    ,CASE WHEN col2 > col3 THEN col3 ELSE col2 END

    ,CASE WHEN col2 > col3 THEN col2 ELSE col3 END

    ,CASE WHEN col4 > col5 THEN col5 ELSE col4 END

    ,CASE WHEN col4 > col5 THEN col4 ELSE col5 END

    FROM FirstSort

    )

    , ThirdSort (col1, col2, col3, col4, col5) AS (

    SELECT

    CASE WHEN col1 > col2 THEN col2 ELSE col1 END

    ,CASE WHEN col1 > col2 THEN col1 ELSE col2 END

    ,CASE WHEN col3 > col4 THEN col4 ELSE col3 END

    ,CASE WHEN col3 > col4 THEN col3 ELSE col4 END

    ,col5

    FROM SecondSort

    )

    , FourthSort (col1, col2, col3, col4, col5) AS (

    SELECT

    col1

    ,CASE WHEN col2 > col3 THEN col3 ELSE col2 END

    ,CASE WHEN col2 > col3 THEN col2 ELSE col3 END

    ,CASE WHEN col4 > col5 THEN col5 ELSE col4 END

    ,CASE WHEN col4 > col5 THEN col4 ELSE col5 END

    FROM ThirdSort

    )

    SELECT

    CAST(col1 AS varchar(6)) +

    CAST(col2 AS varchar(6)) +

    CAST(col3 AS varchar(6)) +

    CAST(col4 AS varchar(6)) +

    CAST(col5 AS varchar(6))

    FROM FourthSort

    John

  • While the real world application of this is totally elusive the actual code is quite simple.

    if OBJECT_ID('tempdb..#something') is not null

    drop table #something

    create table #something

    (

    col1 int

    , col2 int

    , col3 int

    , col4 int

    , col5 int

    )

    insert #something

    select 333, 555, 111, 222, 444

    declare @MyNewValue varchar(100) = ''

    select @MyNewValue = @MyNewValue + CAST(u.myCols as varchar(10))

    from #something s

    unpivot (myCols for field in (col1, col2, col3, col4, col5)) u

    order by u.myCols

    select @MyNewValue

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • While waiting for a more detailed explanation I did throw the following together. This doesn't let you off the hook for a more detailed explanation. I'd like to know the business case for what you are doing.

    create function dbo.ConcatSort( @p1 int, @p2 int, @p3 int, @p4 int, @p5 int)

    returns table

    as return (

    with basedata as (

    select val from

    (values (@p1),(@p2),(@p3),(@p4),(@p5))dt(val)

    )

    select stuff((select cast(val as varchar) sortval from basedata order by val for xml path(''),TYPE).value('.','varchar(max)'),1,0,'') sortval

    );

    go

    declare @test-2 table (

    col1 int,

    col2 int,

    col3 int,

    col4 int,

    col5 int);

    insert into @test-2

    values (333,555,111,222,444),(222,333,444,555,111);

    select * from @test-2;

    select * from @test-2 cross apply dbo.ConcatSort(col1,col2,col3,col4,col5) order by sortval;

    go

  • Let me try this:

    Col1 - has values that originate from table 1

    Col2 - has values that originate from table 2*

    Col3 - has values that originate from table 2*

    Col4 - has values that originate from table 3**

    Col5 - has values that originate from table 3**

    Col6 - has values that originate from table 3**

    Col7 - has values that originate from table 4

    Col8 - has values that can originate from either table: 2, 3, or 4

    Col9 - has values that originate from table 5

    Now, tables 1-5 are ALL a subset of table 0 (just grouped by a category). Hence the values stored in col1-9 are of the same type but unique. But still, when the append qry runs (for example), cols 4-6 can end up with the same 3 values just in different columns (***in 3 different rows***).

    In MS Access appending a duplicate keys is ignored when running an append qry. And this is what I'm trying to accomplish in this case.

  • Sean, we will need to run some tests between our respective answers.

  • Lynn Pettis (9/22/2015)


    Sean, we will need to run some tests between our respective answers.

    Agreed. I have a feeling yours might win out because iirc the unpivot can get a bit slow but for only 5 columns it might be ok.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • hreyes819 (9/22/2015)


    Let me try this:

    Col1 - has values that originate from table 1

    Col2 - has values that originate from table 2*

    Col3 - has values that originate from table 2*

    Col4 - has values that originate from table 3**

    Col5 - has values that originate from table 3**

    Col6 - has values that originate from table 3**

    Col7 - has values that originate from table 4

    Col8 - has values that can originate from either table: 2, 3, or 4

    Col9 - has values that originate from table 5

    Now, tables 1-5 are ALL a subset of table 0 (just grouped by a category). Hence the values stored in col1-9 are of the same type but unique. But still, when the append qry runs (for example), cols 4-6 can end up with the same 3 values just in different columns (***in 3 different rows***).

    In MS Access appending a duplicate keys is ignored when running an append qry. And this is what I'm trying to accomplish in this case.

    Why do you have to cram all this into a single column to make your primary key? This sounds like a perfect time for a composite key across those columns. Or make that combination of columns a unique clustered index and add an identity column as a nonclustered primary key. That would be a lot simpler than jumping through these hoops to derive a subset of values that exist in another column already.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • hreyes819 (9/22/2015)


    Let me try this:

    Col1 - has values that originate from table 1

    Col2 - has values that originate from table 2*

    Col3 - has values that originate from table 2*

    Col4 - has values that originate from table 3**

    Col5 - has values that originate from table 3**

    Col6 - has values that originate from table 3**

    Col7 - has values that originate from table 4

    Col8 - has values that can originate from either table: 2, 3, or 4

    Col9 - has values that originate from table 5

    Now, tables 1-5 are ALL a subset of table 0 (just grouped by a category). Hence the values stored in col1-9 are of the same type but unique. But still, when the append qry runs (for example), cols 4-6 can end up with the same 3 values just in different columns (***in 3 different rows***).

    In MS Access appending a duplicate keys is ignored when running an append qry. And this is what I'm trying to accomplish in this case.

    Still too vague to really understand the business case. If you can put together an example using sample (read, not production) data that can show us what you are doing would really help.

    Also, your original post only mentioned five columns and what you posted goes to nine, so which is it, concatenate five columns or concatenate nine columns?

  • Sean Lange (9/22/2015)


    Lynn Pettis (9/22/2015)


    Sean, we will need to run some tests between our respective answers.

    Agreed. I have a feeling yours might win out because iirc the unpivot can get a bit slow but for only 5 columns it might be ok.

    Yes, but I need to wrap your code into an itvf first.

    And I must be brain dead as I can't see how to do it and keep it the way you wrote it.

  • It seems that you need to review the way you create the primary key.

    Here's an example on the failure and a possible workaround.

    You mentioned you're using long data types, which mean that you need 20 spaces for each column.

    CREATE TABLE #WeirdTable(

    RowID int IDENTITY(1,1),

    PrimaryKey varchar( 100), --20 possible digits * 5

    col1 bigint,

    col2 bigint,

    col3 bigint,

    col4 bigint,

    col5 bigint

    )

    INSERT INTO #WeirdTable(

    col1, col2, col3, col4, col5)

    VALUES(333, 555, 111, 222, 444),

    (222, 333, 444, 555, 111),

    (1, 11, 2223, 3344, 4555)

    SELECT *,

    (SELECT col + 0

    FROM (VALUES(col1), (col2), (col3), (col4), (col5))x(col)

    ORDER BY col

    FOR XML PATH('')

    )

    FROM #WeirdTable o

    SELECT *,

    (SELECT CAST( col AS char(20))

    FROM (VALUES(col1), (col2), (col3), (col4), (col5))x(col)

    ORDER BY col

    FOR XML PATH('')

    )

    FROM #WeirdTable o

    GO

    DROP TABLE #WeirdTable

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I have a question for the respondents. Why am I the only one that actually created an itvf? The OP did ask for a function.

  • Lynn Pettis (9/22/2015)


    Sean Lange (9/22/2015)


    Lynn Pettis (9/22/2015)


    Sean, we will need to run some tests between our respective answers.

    Agreed. I have a feeling yours might win out because iirc the unpivot can get a bit slow but for only 5 columns it might be ok.

    Yes, but I need to wrap your code into an itvf first.

    And I must be brain dead as I can't see how to do it and keep it the way you wrote it.

    Not brain dead at all. If you keep it the same as I wrote it the performance will be poor for a table valued function because I am using the "variable = variable + newvalue" trick. In a function you would have to declare that variable and then the select to populate followed by the final return which would kill the performance because it would no longer be inline. It would become a multi-statment table valued function at which point a scalar would be no better. To turn it into an itvf it would end up being your code. 😉

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply