SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


«««12345»»»

Trying to UnPivot without using multiple selects unioned together Expand / Collapse
Author
Message
Posted Thursday, October 29, 2009 10:42 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 11:22 AM
Points: 2,495, Visits: 3,109
Hence, my earlier comment

I wonder if, under the covers, an UNPIVOT is really just a cross join like Peter's solution.



__________________________________________________

My evil twin is Barry Young.

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Whenever possible, I prefer to eat the rude. -- Hannibal Lecter
Post #810988
Posted Thursday, October 29, 2009 3:11 PM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 11:51 AM
Points: 4,205, Visits: 2,974
Peter's latest is awesome!

If I'm allowed a little 2008-syntax, this produces the same plan, but with even more compact SQL:

select  U.rowid, CA.column_nm, CA.old, CA.new
from @Update U
cross apply
( values
('column1', convert(sql_variant, u.Column1_old), convert(sql_variant, u.Column1_new)),
('column2', u.Column2_old, u.Column2_new),
('column3', u.Column3_old, u.Column3_new)
) CA (column_nm, old, new)
order by U.rowid, CA.column_nm;




SQLCLR: Better than cutting boards with a hammer
(Pedro DeRose)
Post #811239
Posted Thursday, October 29, 2009 4:39 PM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Today @ 1:04 PM
Points: 771, Visits: 2,142

select  U.rowid, CA.column_nm, CA.old, CA.new
from @Update U
cross apply
( values
('column1', convert(sql_variant, u.Column1_old), convert(sql_variant, u.Column1_new)),
('column2', u.Column2_old, u.Column2_new),
('column3', u.Column3_old, u.Column3_new)
) CA (column_nm, old, new)
order by U.rowid, CA.column_nm;


Very nice Paul. Never realized you could use the VALUES clause as a derived table in SQL 2008. And I looked up BOL to find anything about the implicit conversions, but I couldn't find anything. Any way, it seems to work that way. Good to know.

Peter
Post #811277
Posted Thursday, October 29, 2009 6:38 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 11:22 AM
Points: 2,495, Visits: 3,109
Emperor Paulpatine, that is truly elegant to behold.

__________________________________________________

My evil twin is Barry Young.

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Whenever possible, I prefer to eat the rude. -- Hannibal Lecter
Post #811314
Posted Thursday, October 29, 2009 6:44 PM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 11:51 AM
Points: 4,205, Visits: 2,974
Peter Brinkhaus (10/29/2009)
...I looked up BOL to find anything about the implicit conversions, but I couldn't find anything. Any way, it seems to work that way. Good to know.

The rules are the same as for UNION - see Guidelines for Using Union:

Books Online
When different data types are combined in a UNION operation, they are converted by using the rules of data type precedence. In the previous example, the int values are converted to float, because float has a higher precedence than int. For more information, see Data Type Precedence (Transact-SQL).

SQL_VARIANT happens to have the second-highest precedence (after UDTs).



SQLCLR: Better than cutting boards with a hammer
(Pedro DeRose)
Post #811315
Posted Thursday, October 29, 2009 6:50 PM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 11:51 AM
Points: 4,205, Visits: 2,974
Bob Hovious 24601 (10/29/2009)
Emperor Paulpatine, that is truly elegant to behold.

Wait til Barry gets hold of it and on-the-fly compresses it down to 20 bytes

http://beyondrelational.com/blogs/tc/archive/2009/06/22/tsql-challenge-7-solution-by-barry-young.aspx



SQLCLR: Better than cutting boards with a hammer
(Pedro DeRose)
Post #811318
Posted Saturday, October 31, 2009 11:08 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 12:10 PM
Points: 1,665, Visits: 2,037
Okay, I've finished my performance testing. See attached zip file for the code used, as well as the execution plans generated and the trace file outputs.

Test was:
Create a 1,000,000 row test table, with 30 pairs of columns.
Run Bob's "Hybrid" Cross-Join / Unpivot solution.
Run Peter's Cross-Join.
Run Paul's first solution (UnPivot / RePivot).

The winner, by a landslide, is Peter's solution. It consistently ran in 1/3 the time of the second place, which was Bob's.
Sorry Paul, but your solution was a very distant 3rd. I let it run overnight, and it wasn't finished in the morning. And tempdb had sucked up all remaining space on the drive, and was very busy swapping out with the page file.

I then ran the following tests:
Test 1: rerun test, but stopped Paul's after 30 minutes. See SSCTest1.trc. Tempdb was up to 30gb at this point.

After seeing the effect of Paul's solution on tempdb, I decided to measure what effect the resizing of tempdb had on the performance of the other two routines.
After a clean boot, a "select name, size from tempdb.sys.database_files" has tempDev/tempLog of 1024/64.
After creating the million row #update table, they were 52048/528
After running Bob's Hybrid method, they were 263168/8720.
After running Peter's Cross-Join method, they were 263168/8720 (the same as after Bob's).
see SSCTest2.trc

Seeing that Peter's didn't have to resize tempdb, I then rebooted, and ran the tests again, but running Peter's Cross-Join method before Bobs:
After the reboot, the tempDev/tempLog were 1024/64.
After creating the million row #update table, they were 52048/528
After running Peter's Cross-Join method, they were 217488/7208
After running Bob's Hybrid method, they were 263168/8720.
In this test, Peter's Cross-Join took about 1 minute longer to run, while Bob's Hybrid method took about 30 seconds less to run.
see SSCTest3.trc

In all of these tests, Peter's Cross-Tab solution had about 9x the reads and fewer writes, but it always ran in about 1/3 the time. The percentage of writes varies by whether it was run before or after Bob's Hybrid solution.

Finally, I wanted to get a comparison of Paul's method, and to see it's execution plan. I set the test to 1000 rows, and ran again. Results are in SSCTest4.trc.
Even with this small number of rows, the CPU and Duration was 20x the Cross-Join method; the reads were 1000x, and the writes were 12,000x. Peter's Cross-Join was finished in 237ms, while Bob's Hybrid was finished in 1050ms. The UnPivot/RePivot took 5656ms.
When I tried this again with 10,000 rows, I had to abort it again... it was taking just too long. It clearly does not scale uniformly.

All execution plans show table scans of the #Update table (expected).

So, in summary:
On the million row test, the Hybrid solution ran consistently in just short of 11 minutes; while the pure cross-join method ran in just over 4 minutes.
With only 1,000 rows, Paul's Unpivot/RePivot solution took longer than the Cross-Join on one million rows. It also required extensive tempdb usage.

In one of the earlier posts, there was mention of the time to write the code. They all took about the same. Though for all, I let SQL write a lot of it for me, then cut-and-paste.

Note: the test was performed on my 4gb x64 laptop, w/ SQL 2008. You might not be able to read the trace files if you're not using SQL 2008 Profiler... but you should be able to query them with fn_trace_gettable.

Edit: added attachment (wonder where it went the first time...)
Edit2: I forgot to mention that in all of my tests, I had SSMS set to discard results.


Wayne
For better assistance in answering your questions, click here.
For performance problems, please read this.
For common date/time routines, click here.
For CROSS-TABS and PIVOT tables, click on Part 1 and Part 2.


  Post Attachments 
SSCTest.zip (2 views, 89.63 KB)
Post #812031
Posted Saturday, October 31, 2009 11:17 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 12:10 PM
Points: 1,665, Visits: 2,037
Bob Hovious 24601 (10/29/2009)
I wonder if, under the covers, an UNPIVOT is really just a cross join like Peter's solution.
I don't think so... it doesn't perform like one.


Wayne
For better assistance in answering your questions, click here.
For performance problems, please read this.
For common date/time routines, click here.
For CROSS-TABS and PIVOT tables, click on Part 1 and Part 2.
Post #812032
Posted Saturday, October 31, 2009 12:32 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 11:22 AM
Points: 2,495, Visits: 3,109
No matter, Wayne. I'm just delighted to see Peter's solution win out.

__________________________________________________

My evil twin is Barry Young.

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Whenever possible, I prefer to eat the rude. -- Hannibal Lecter
Post #812040
Posted Saturday, October 31, 2009 2:48 PM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Today @ 1:04 PM
Points: 771, Visits: 2,142
Wayne, thanks for the extensive test results. However I don't see the attachments in your post. I'm quite curious about the execution plans on a millions input rows, because the execution plans of all the solutions posted in this thread looked nearly identical. I expected only a little difference in CPU time, but not that difference in use of tempdb. Can you post the attachment again. Thanks. Great job.

Peter
Post #812054
« Prev Topic | Next Topic »

«««12345»»»

Permissions Expand / Collapse