ScottPletcher (10/2/2012)
SQL Server is optimized to do table lookups, whereas CASE statements are comparatively very slow.Therefore, I suggest using a lookup table, as suggested by others. You absolutely want to make the lookup code the unqiue clustering key to the table, to speed up SQL's table search. You can make it an actual PRIMARY KEY also, if you want to, but that's not required.
That's interesting. I know from experience that CROSSTAB queries using CASE can be accelerated if the data is preaggregated before applying the aggregate across the CASE statements. I never really considered that CASE statements might significantly slow up a straightforward SELECT without aggregation. So here's a quick and dirty test:
-- crude test of cost of CASE
DROP TABLE #Temp
SELECT TOP 200000 rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
INTO #Temp
FROM sys.columns a, sys.columns b, sys.columns c
DROP TABLE #temp2
DROP TABLE #temp3
PRINT 'Simple SELECT'
SET STATISTICS TIME ON
SELECT rn, rn2 = CAST(rn AS VARCHAR(6))
INTO #temp2
FROM #Temp
SET STATISTICS TIME OFF
PRINT 'SELECT with 60 CASE alternatives'
SET STATISTICS TIME ON
SELECT rn,
rn2 = CASE rn
WHEN 1 THEN CAST(rn AS VARCHAR(6))
WHEN 2 THEN CAST(rn AS VARCHAR(6))
WHEN 3 THEN CAST(rn AS VARCHAR(6))
WHEN 4 THEN CAST(rn AS VARCHAR(6))
WHEN 5 THEN CAST(rn AS VARCHAR(6))
WHEN 6 THEN CAST(rn AS VARCHAR(6))
WHEN 7 THEN CAST(rn AS VARCHAR(6))
WHEN 8 THEN CAST(rn AS VARCHAR(6))
WHEN 9 THEN CAST(rn AS VARCHAR(6))
WHEN 10 THEN CAST(rn AS VARCHAR(6))
WHEN 11 THEN CAST(rn AS VARCHAR(6))
WHEN 12 THEN CAST(rn AS VARCHAR(6))
WHEN 13 THEN CAST(rn AS VARCHAR(6))
WHEN 14 THEN CAST(rn AS VARCHAR(6))
WHEN 15 THEN CAST(rn AS VARCHAR(6))
WHEN 16 THEN CAST(rn AS VARCHAR(6))
WHEN 17 THEN CAST(rn AS VARCHAR(6))
WHEN 18 THEN CAST(rn AS VARCHAR(6))
WHEN 19 THEN CAST(rn AS VARCHAR(6))
WHEN 20 THEN CAST(rn AS VARCHAR(6))
WHEN 100 THEN CAST(rn AS VARCHAR(6))
WHEN 200 THEN CAST(rn AS VARCHAR(6))
WHEN 300 THEN CAST(rn AS VARCHAR(6))
WHEN 400 THEN CAST(rn AS VARCHAR(6))
WHEN 500 THEN CAST(rn AS VARCHAR(6))
WHEN 600 THEN CAST(rn AS VARCHAR(6))
WHEN 700 THEN CAST(rn AS VARCHAR(6))
WHEN 800 THEN CAST(rn AS VARCHAR(6))
WHEN 900 THEN CAST(rn AS VARCHAR(6))
WHEN 1000 THEN CAST(rn AS VARCHAR(6))
WHEN 1100 THEN CAST(rn AS VARCHAR(6))
WHEN 1200 THEN CAST(rn AS VARCHAR(6))
WHEN 1300 THEN CAST(rn AS VARCHAR(6))
WHEN 1400 THEN CAST(rn AS VARCHAR(6))
WHEN 1500 THEN CAST(rn AS VARCHAR(6))
WHEN 1600 THEN CAST(rn AS VARCHAR(6))
WHEN 1700 THEN CAST(rn AS VARCHAR(6))
WHEN 1800 THEN CAST(rn AS VARCHAR(6))
WHEN 1900 THEN CAST(rn AS VARCHAR(6))
WHEN 2000 THEN CAST(rn AS VARCHAR(6))
WHEN 10000 THEN CAST(rn AS VARCHAR(6))
WHEN 20000 THEN CAST(rn AS VARCHAR(6))
WHEN 30000 THEN CAST(rn AS VARCHAR(6))
WHEN 40000 THEN CAST(rn AS VARCHAR(6))
WHEN 50000 THEN CAST(rn AS VARCHAR(6))
WHEN 60000 THEN CAST(rn AS VARCHAR(6))
WHEN 70000 THEN CAST(rn AS VARCHAR(6))
WHEN 80000 THEN CAST(rn AS VARCHAR(6))
WHEN 90000 THEN CAST(rn AS VARCHAR(6))
WHEN 100000 THEN CAST(rn AS VARCHAR(6))
WHEN 110000 THEN CAST(rn AS VARCHAR(6))
WHEN 120000 THEN CAST(rn AS VARCHAR(6))
WHEN 130000 THEN CAST(rn AS VARCHAR(6))
WHEN 140000 THEN CAST(rn AS VARCHAR(6))
WHEN 150000 THEN CAST(rn AS VARCHAR(6))
WHEN 160000 THEN CAST(rn AS VARCHAR(6))
WHEN 170000 THEN CAST(rn AS VARCHAR(6))
WHEN 180000 THEN CAST(rn AS VARCHAR(6))
WHEN 190000 THEN CAST(rn AS VARCHAR(6))
WHEN 200000 THEN CAST(rn AS VARCHAR(6))
ELSE
CAST(rn AS VARCHAR(6))
END
INTO #temp3
FROM #Temp
SET STATISTICS TIME OFF
I ran the statements a number of times, returning the results to screen or to #temp table from my local instance. Here are the average values for 10 runs, returning to #temp tables:
Simple SELECT
SQL Server Execution Times:
CPU time = 123.4 ms, elapsed time = 123.2 ms.
(200000 row(s) affected)
SELECT with 60 CASE alternatives
SQL Server Execution Times:
CPU time = 112.3 ms, elapsed time = 160.0 ms.
(200000 row(s) affected)
Adding loads of CASE alternatives doesn't appear to change the CPU time very much at all but appears to have a quite significant effect on the elapsed time – increasing it by about 30%.
I switched to using startdatetime/enddatetime, like this:
DECLARE @Startdate DATETIME
PRINT 'Simple SELECT'
SET @Startdate = GETDATE()
--SET STATISTICS TIME ON
SELECT rn, rn2 = CAST(rn AS VARCHAR(6))
INTO #temp2
FROM #Temp
--SET STATISTICS TIME OFF
PRINT DATEDIFF(MILLISECOND,@Startdate,GETDATE()) --@MSDuration
and here are the averaged results from 10 runs:
Simple SELECT
(200000 row(s) affected)
136.3
SELECT with 60 CASE alternatives
(200000 row(s) affected)
159.7
The difference this time is a little less than 20%. The conclusion I’m going to take home from this is – “you can add quite a few options into a CASE statement before it will significantly affect the run time of your query”. What it doesn’t do is account for the relative cost of each option evaluated, i.e. what happens if the CASE options are computationally much more expensive than casting an INT to a VARCHAR? I think you can guess 😉
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden