Trying to UnPivot without using multiple selects unioned together

  • I've got a table that is populated from the OUTPUT from an update statement:

    declare @Update TABLE (

    RowID INT,

    Column1_old DECIMAL(2,0),

    Column1_new DECIMAL(2,0),

    Column2_old DATETIME,

    Column2_new DATETIME,

    Column3_old BIT,

    Column3_new BIT)

    insert into @Update

    select 1, 25, 52, '20090101', '20091001', 1, 1 UNION

    select 2, 82, 56, '20090101', '20090101', 1, 0 UNION

    select 3, 25, 25, '20091001', '20091001', 1, 1

    -- what the incoming data looks like

    select * from @Update

    I'd like to unpivot the different columns so that it looks like this:

    -- desired results:

    declare @Updates TABLE (RowID INT, Column_nm varchar(30), old_value varchar(30), new_value varchar(30))

    insert into @Updates

    select 1, 'Column1', 25, 52 UNION

    select 1, 'Column2', '20090101','20091001' UNION

    select 1, 'Column3', 1, 1 UNION

    select 2, 'Column1', 82, 56 UNION

    select 2, 'Column2', '20090101','20090101' UNION

    select 2, 'Column3', 1, 0 UNION

    select 3, 'Column1', 25, 25 UNION

    select 3, 'Column2', '20090101','20091001' UNION

    select 3, 'Column3', 1, 1

    select * from @Updates order by RowID, Column_nm

    The only solution that I've been able to come up with uses a select for each pair of columns, all unioned together:

    declare @Updates2 TABLE (RowID INT, Column_nm varchar(30), old_value varchar(30), new_value varchar(30))

    insert into @Updates2

    select RowID, 'Column1', convert(varchar(30),column1_old), convert(varchar(30), column1_new) from @Update UNION

    select RowID, 'Column2', convert(char(8), column2_old, 112), convert(char(8), column2_new, 112) from @Update UNION

    select RowID, 'Column3', convert(char(1),column3_old), convert(char(1),column3_new) from @Update

    select * from @Updates2 order by RowID, Column_nm

    But, I'd like to avoid the repetitive hits against the table (especially when the actual output is about 30 pairs of columns).

    Do you have any idea of if this can be accomplished, and if so, how?

    Thanks!!!

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • To my amazement, this can be done. I sort of spaced out while playing with this and when I regained consciousness here it was. The query plan shows only a single scan of the source table (@update).

    Let me know if you have any questions, Wayne.

    -- cte is necessary to make all columns compatible datatypes.

    -- if date format isn't what you want to see, add CONVERT()

    ;with cte (rowid,column1_old,column1_new,column2_old,column2_new,column3_old,column3_new) as

    (select rowId, cast(column1_old as varchar),cast(column1_new as varchar)

    ,cast(column2_old as varchar),cast(column2_new as varchar),cast(column3_old as varchar)

    ,cast(column3_new as varchar)

    from @update)

    /*

    UNPIVOT allows us to write out three rows for 1, but is limited to combining only a single

    set of columns into one column. BUT, combined with old-school cross-tabbing using a

    CASE expression, it gets the job done.

    In the query below, we UNPIVOT to get the "New" column" but crosstab to get the "Old" column.

    */

    select rowID,left(c1,7) as column_nm

    ,case when left(c1,7) = 'column1' then column1_old

    when left(c1,7) = 'column2' then column2_old

    when left(c1,7) = 'column3' then column3_old

    end as old

    ,p1 as new

    FROM

    (SELECT *

    FROM cte) AS p

    UNPIVOT

    (p1 FOR c1 IN

    (column1_new,column2_new,column3_new)

    )AS unpvt

    order by rowID,left(c1,7)

    I think this is worth filing away for future use when you want to convert a single row into multiple rows, and UNPIVOT alone isn't enough. The code might look clunky, but it's the best execution plan I've seen for that purpose

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • I have seen how fast is this technique for the first time during the "Phil Factor SQL Speed Phreak Competition: No 1".

    It was Peso that came up with the UNPIVOT solution and, with my greatest surprise, it did a single scan on the table, with a very good execution time.

    I would have expected UNPIVOT to perform something like a join or union internally, but it doesn't seem to need any of that.

    -- Gianluca Sartori

  • A nice and surprising solution by Bob (it took me some time to see where c1 in left(c1, 7) was coming from). All I could come up with was:

    select

    RowID,

    case

    when column_no = 1 then

    'Column1'

    when column_no = 2 then

    'Column2'

    when column_no = 3 then

    'Column3'

    end Column_nm,

    case

    when column_no = 1 then

    cast(Column1_old as sql_variant)

    when column_no = 2 then

    cast(Column2_old as sql_variant)

    when column_no = 3 then

    cast(Column3_old as sql_variant)

    end old_value,

    case

    when column_no = 1 then

    cast(Column1_new as sql_variant)

    when column_no = 2 then

    cast(Column2_new as sql_variant)

    when column_no = 3 then

    cast(Column3_new as sql_variant)

    end new_value

    from

    @Update

    cross join

    (

    select 1 column_no union all select 2 union all select 3

    ) n

    order by

    1, 2

    To my surprise both solutions produce nearly the same query plan. But if you have 30 pairs of columns, as the OP mentioned I should definitely stay with Bob's solution.

    Peter

  • DOH!!

    Peter, I've been using Cross-Joins to expand tables forever and it never occurred to me last night. I think that your solution has serious merit and that Wayne should test both with his 30 column table to compare the performance with real world volumes. UNPIVOT may involve some overhead that your solution avoids. Frankly, I find yours easier to read and understand at a glance. It doesn't blend concepts. Nice work!!

    Gianluca, I've seen Phil's competitions posted, but haven't had the time to study the winning solutions. I should probably make time to do that because I'm sure they are producing some truly ninja techniques.:-P

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Hey guys, this looks awesome. I had pushed the UNPIVOT out of my mind 'cause I didn't think it would work with multiple sets of columns.

    I'll post some results as soon as I can wrap my head around all of this, get it coded, and see what the test looks like. However, if you're talking about a single scan of the table, that sure beats 30 scans, once for each pair. Which is what I was trying to avoid.

    Thanks for the help!

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • You're welcome, Wayne. Please let us know which you finally go with and why. If one is significantly faster than the other, I'd like to know. I'm actually hoping that Peter's cross join wins, just because I'm sentimental about old-school techniques getting the job done. Worst case, we have a good technique for SQL2000 and a good technique for SQL25K and beyond.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Thanks Bob. I just expected the pivot solution to require far less coding in the case of 30 column pairs. But even that doesn't seem to be the case, now I take closer look at it.

    Peter

  • Is it too late to add this to the mix...?

    ;WITH

    TypeConvert AS

    (

    -- Convert to sql_variant for the upcoming UNPIVOT

    SELECT rowid,

    column1_old = CONVERT(SQL_VARIANT, column1_old),

    column2_old = CONVERT(SQL_VARIANT, column2_old),

    column3_old = CONVERT(SQL_VARIANT, column3_old),

    column1_new = CONVERT(SQL_VARIANT, column1_new),

    column2_new = CONVERT(SQL_VARIANT, column2_new),

    column3_new = CONVERT(SQL_VARIANT, column3_new)

    FROM @Update

    ),

    Unpivoted AS

    (

    -- Break the column names up into key-value pairs

    SELECT rowid,

    column_nm = LEFT(column_name, 7),

    version_nm = RIGHT(column_name,3) + '_value',

    value

    FROM TypeConvert

    UNPIVOT (

    value

    FOR column_name IN

    (column1_old, column2_old, column3_old, column1_new, column2_new, column3_new)

    ) PointlessAlias

    ),

    Pivoted AS

    (

    -- Un-unpivot in the way we want stuff displayed

    SELECT rowid,

    column_nm,

    old_value,

    new_value

    FROM Unpivoted

    PIVOT (

    MAX(value)

    FOR version_nm IN (old_value, new_value)

    ) PointlessAlias

    )

    -- The output, already

    SELECT rowid,

    column_nm,

    old_value,

    new_value

    FROM Pivoted

    ORDER BY

    rowid, column_nm;

  • Your input is always welcome, Emperor Paulpatine. Your solution is cool. I even like the use of SQL_VARIANT. The execution plan also looks sweet, although I am mystified as to how it gets there from here.

    I have one more variation I want to write up that will use a CTE to pair up the old/new sets, a second CTE to UNPIVOT, and a third CTE to split the pairs. When this is all done, I am going to try and write a little article. This seems like a niche problem, but the variety of solutions is proving to be a lot of fun. 🙂

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Cheers Bob - and yes I think an article is a great idea. The solution I posted owes much to your ideas and Peter's, but just does things slightly differently overall. The optimizer does a great job with it, I agree 🙂

  • Bob Hovious 24601 (10/28/2009)


    You're welcome, Wayne. Please let us know which you finally go with and why. If one is significantly faster than the other, I'd like to know. I'm actually hoping that Peter's cross join wins, just because I'm sentimental about old-school techniques getting the job done. Worst case, we have a good technique for SQL2000 and a good technique for SQL25K and beyond.

    Okay, I'm still working on getting some performance specs to compare all three methods. However, I'm leaning strongly towards Peter's method. The reason... the flexibility to put into the "column_nm" column user-friendly text, like "First Name", "Birth Date", etc.

    I'm pretty impressed that all three examples are showing single table scans. The method I had posted was going to do a scan for each column pair, and avoiding that (especially with nearly 30 column pairs) was the reason for this post in the first place.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • This has been an interesting problem to follow.

    I really like the PointlessAlias as the Table Alias Emperator PaulPatine.

    I agree with Wayne that it is interesting that all three solutions create just about the same execution plan and statistics.

    Good luck on the Article Bob.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Matt Whitfield has written a .NET test harness that maybe could be helpful in choosing the fastest solution.

    You could call him in the thread and ask for some info.

    -- Gianluca Sartori

  • Thanks, Gianluca. I'll ask him about it.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 15 posts - 1 through 15 (of 50 total)

You must be logged in to reply to this topic. Login to reply