• ScottPletcher (10/2/2012)


    SQL Server is optimized to do table lookups, whereas CASE statements are comparatively very slow.

    Therefore, I suggest using a lookup table, as suggested by others. You absolutely want to make the lookup code the unqiue clustering key to the table, to speed up SQL's table search. You can make it an actual PRIMARY KEY also, if you want to, but that's not required.

    That's interesting. I know from experience that CROSSTAB queries using CASE can be accelerated if the data is preaggregated before applying the aggregate across the CASE statements. I never really considered that CASE statements might significantly slow up a straightforward SELECT without aggregation. So here's a quick and dirty test:

    -- crude test of cost of CASE

    DROP TABLE #Temp

    SELECT TOP 200000 rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    INTO #Temp

    FROM sys.columns a, sys.columns b, sys.columns c

    DROP TABLE #temp2

    DROP TABLE #temp3

    PRINT 'Simple SELECT'

    SET STATISTICS TIME ON

    SELECT rn, rn2 = CAST(rn AS VARCHAR(6))

    INTO #temp2

    FROM #Temp

    SET STATISTICS TIME OFF

    PRINT 'SELECT with 60 CASE alternatives'

    SET STATISTICS TIME ON

    SELECT rn,

    rn2 = CASE rn

    WHEN 1 THEN CAST(rn AS VARCHAR(6))

    WHEN 2 THEN CAST(rn AS VARCHAR(6))

    WHEN 3 THEN CAST(rn AS VARCHAR(6))

    WHEN 4 THEN CAST(rn AS VARCHAR(6))

    WHEN 5 THEN CAST(rn AS VARCHAR(6))

    WHEN 6 THEN CAST(rn AS VARCHAR(6))

    WHEN 7 THEN CAST(rn AS VARCHAR(6))

    WHEN 8 THEN CAST(rn AS VARCHAR(6))

    WHEN 9 THEN CAST(rn AS VARCHAR(6))

    WHEN 10 THEN CAST(rn AS VARCHAR(6))

    WHEN 11 THEN CAST(rn AS VARCHAR(6))

    WHEN 12 THEN CAST(rn AS VARCHAR(6))

    WHEN 13 THEN CAST(rn AS VARCHAR(6))

    WHEN 14 THEN CAST(rn AS VARCHAR(6))

    WHEN 15 THEN CAST(rn AS VARCHAR(6))

    WHEN 16 THEN CAST(rn AS VARCHAR(6))

    WHEN 17 THEN CAST(rn AS VARCHAR(6))

    WHEN 18 THEN CAST(rn AS VARCHAR(6))

    WHEN 19 THEN CAST(rn AS VARCHAR(6))

    WHEN 20 THEN CAST(rn AS VARCHAR(6))

    WHEN 100 THEN CAST(rn AS VARCHAR(6))

    WHEN 200 THEN CAST(rn AS VARCHAR(6))

    WHEN 300 THEN CAST(rn AS VARCHAR(6))

    WHEN 400 THEN CAST(rn AS VARCHAR(6))

    WHEN 500 THEN CAST(rn AS VARCHAR(6))

    WHEN 600 THEN CAST(rn AS VARCHAR(6))

    WHEN 700 THEN CAST(rn AS VARCHAR(6))

    WHEN 800 THEN CAST(rn AS VARCHAR(6))

    WHEN 900 THEN CAST(rn AS VARCHAR(6))

    WHEN 1000 THEN CAST(rn AS VARCHAR(6))

    WHEN 1100 THEN CAST(rn AS VARCHAR(6))

    WHEN 1200 THEN CAST(rn AS VARCHAR(6))

    WHEN 1300 THEN CAST(rn AS VARCHAR(6))

    WHEN 1400 THEN CAST(rn AS VARCHAR(6))

    WHEN 1500 THEN CAST(rn AS VARCHAR(6))

    WHEN 1600 THEN CAST(rn AS VARCHAR(6))

    WHEN 1700 THEN CAST(rn AS VARCHAR(6))

    WHEN 1800 THEN CAST(rn AS VARCHAR(6))

    WHEN 1900 THEN CAST(rn AS VARCHAR(6))

    WHEN 2000 THEN CAST(rn AS VARCHAR(6))

    WHEN 10000 THEN CAST(rn AS VARCHAR(6))

    WHEN 20000 THEN CAST(rn AS VARCHAR(6))

    WHEN 30000 THEN CAST(rn AS VARCHAR(6))

    WHEN 40000 THEN CAST(rn AS VARCHAR(6))

    WHEN 50000 THEN CAST(rn AS VARCHAR(6))

    WHEN 60000 THEN CAST(rn AS VARCHAR(6))

    WHEN 70000 THEN CAST(rn AS VARCHAR(6))

    WHEN 80000 THEN CAST(rn AS VARCHAR(6))

    WHEN 90000 THEN CAST(rn AS VARCHAR(6))

    WHEN 100000 THEN CAST(rn AS VARCHAR(6))

    WHEN 110000 THEN CAST(rn AS VARCHAR(6))

    WHEN 120000 THEN CAST(rn AS VARCHAR(6))

    WHEN 130000 THEN CAST(rn AS VARCHAR(6))

    WHEN 140000 THEN CAST(rn AS VARCHAR(6))

    WHEN 150000 THEN CAST(rn AS VARCHAR(6))

    WHEN 160000 THEN CAST(rn AS VARCHAR(6))

    WHEN 170000 THEN CAST(rn AS VARCHAR(6))

    WHEN 180000 THEN CAST(rn AS VARCHAR(6))

    WHEN 190000 THEN CAST(rn AS VARCHAR(6))

    WHEN 200000 THEN CAST(rn AS VARCHAR(6))

    ELSE

    CAST(rn AS VARCHAR(6))

    END

    INTO #temp3

    FROM #Temp

    SET STATISTICS TIME OFF

    I ran the statements a number of times, returning the results to screen or to #temp table from my local instance. Here are the average values for 10 runs, returning to #temp tables:

    Simple SELECT

    SQL Server Execution Times:

    CPU time = 123.4 ms, elapsed time = 123.2 ms.

    (200000 row(s) affected)

    SELECT with 60 CASE alternatives

    SQL Server Execution Times:

    CPU time = 112.3 ms, elapsed time = 160.0 ms.

    (200000 row(s) affected)

    Adding loads of CASE alternatives doesn't appear to change the CPU time very much at all but appears to have a quite significant effect on the elapsed time – increasing it by about 30%.

    I switched to using startdatetime/enddatetime, like this:

    DECLARE @Startdate DATETIME

    PRINT 'Simple SELECT'

    SET @Startdate = GETDATE()

    --SET STATISTICS TIME ON

    SELECT rn, rn2 = CAST(rn AS VARCHAR(6))

    INTO #temp2

    FROM #Temp

    --SET STATISTICS TIME OFF

    PRINT DATEDIFF(MILLISECOND,@Startdate,GETDATE()) --@MSDuration

    and here are the averaged results from 10 runs:

    Simple SELECT

    (200000 row(s) affected)

    136.3

    SELECT with 60 CASE alternatives

    (200000 row(s) affected)

    159.7

    The difference this time is a little less than 20%. The conclusion I’m going to take home from this is – “you can add quite a few options into a CASE statement before it will significantly affect the run time of your query”. What it doesn’t do is account for the relative cost of each option evaluated, i.e. what happens if the CASE options are computationally much more expensive than casting an INT to a VARCHAR? I think you can guess 😉

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden