• (This is going to sound like a bit of a rant... and it probably is.   Sorry... )

    Although Grant mentioned it in the end, there's a bit of irony in the article in that I wouldn't hire anyone who wrote such code as that shown in the article (simple as it was). 

    Folks, getting the correct answer isn't enough... you have to do it with performance and scalability in mind and you must keep both in mind all the time.  Add code readability to that, as well.

    Let's bump up the stakes a bit... lets use 1,000,000 rows for the same code...

    First, Grant's code doesn't work, as is... it dies at 1,000 so it's NOT SCALABLE.  He wouldn't make it past the first round because he didn't write the code to be scalable despite what the requirements said (100 rows).  Here's the error his code produces...

    Server: Msg 245, Level 16, State 1, Line 7

    Syntax error converting the varchar value '*' to a column of data type int.

    Det's first attempt has a second fault that Grant missed altogether... it has a GO in it that just isn't required and causes the code to fail if you declare any timing variables before it.  He also should have done a SET NOCOUNT ON to keep from printing a million (1 row(s) affected) messages which really slowed his code down.

    Det's second attempt isn't even worth mentioning because it would produce a million single row result sets, as Grant stated.

    Let's look at a couple of other things... Scott made the mistake of formatting the output.  For the most part, formatting the output in SQL is a Bozo-No-No.  That type of formatting should be done in the app, if there is one.

    Except for Det's first attempt, everyone took it at face value that the output should be printed instead of properly returned as a single usable result set.  The requirements never said Print anywhere... it said Count.  Ya gotta think about being able to use the output from any query as a single usable result set all the time.

    John Chapman has the right idea but it'll take a bit more code to reach a million rows.  At least he tried a setbased solution! (Michael Valentine Jones has a dandy function that operates similar to John's performance enabled code but I can't find the URL for it).

    Serqiy certainly has the right idea, but again, no scalability for the future.

    I can't test the CTE solution because I don't have 2k5, but I'll just bet it's fast.

    And... everyone failed the "test" because no one included any documentation!

    After repairing Grant's code, here's the results from the million row tests (not truly scalable but certainly more than 100 rows )...

    Grant's code took        203.626 seconds and pegged my single CPU the whole time.

    Chris' code took          193.640 seconds and pegged my single CPU the whole time.

    Det's first snippet took 286.156 seconds and pegged my single CPU the whole time.

    Scott's code took        285.296 seconds and pegged my single CPU the whole time.

    Those are some pretty hefty times for only a million rows.  So... how would I do this?  Like this, interview or not... would be even faster with MVJ's function...

    /*************************************************************

     Test code requirements:

     1. Count from 1 to 100. (modified to use 1 million rows)

     2. For each number evenly divisible by 3, substitute 'Bizz'.

     3. For each number evenly divisible by 5, substitute 'Buzz'.

     4. For each number divisible by both substitute 'BizzBuzz'

     Revision History:

     Rev 00 - 05/28/2007 - Jeff Moden - Initial creation/unit test.

                         - Deviation: Added duration counter to

                           measure run duration.

     Rev 01 - 05/28/2007 - Jeff Moden - Modified to use 1 million

                           rows for performance/scalability test.

    *************************************************************/

    --===== Declare and start the duration timer

    DECLARE @StartTime DATETIME

        SET @StartTime = GETDATE()

    --===== Declare and set a variable to control the number of

         -- rows created.

    DECLARE @DesiredRowCount INT

        SET @DesiredRowCount = 1000000

    --===== Disable the auto-display of rowcounts for appearance

         -- and speed

        SET NOCOUNT ON

    --===== If the scratchpad table already exists, drop it

         -- Note this is just so you can rerun the test.

         IF OBJECT_ID('TempDB..#Nums','U') IS NOT NULL

            DROP TABLE #Nums

    --===== Limit the number of rows to be built

        SET ROWCOUNT @DesiredRowCount

    --===== Create and populate the table on the fly

      SELECT IDENTITY(INT,1,1) AS RowNum,

             CAST(NULL AS VARCHAR(10)) AS DesiredResult

       INTO #Nums

       FROM Master.dbo.SysColumns sc1 WITH (NOLOCK),

            Master.dbo.SysColumns sc2 WITH (NOLOCK)

    --===== Restore the number of rows to return to normal

         -- Note: Can replace with TOP @variable in SQL 2k5

        SET ROWCOUNT 0

    --===== Produce the desired results according to the

         -- requirements in the code header.  Avoids concatenation which

         -- would cause the code to run a second slower...

     SELECT CASE

                 WHEN RowNum % 15 = 0 THEN 'BizzBuzz' --Divisible by 15

                 WHEN RowNum %  3 = 0 THEN 'Bizz'     --Divisible by 3

                 WHEN RowNum %  5 = 0 THEN 'Buzz'     --Divisible by 5

                 ELSE CAST(RowNum AS VARCHAR(10))

            END AS DesiredResult

       FROM #Nums

      ORDER BY RowNum

    --===== Display the run duration (not part of result set)

      PRINT DATEDIFF(ms,@StartTime,GETDATE())

    ... the total run time is 8.610 seconds (NOT a type-o... less than 9 seconds and a good 20 times (2000%) faster than the fastest loop code) and, yes, it pegged my single CPU... but what would you rather have... a pegged CPU for 3+ minutes or 9 seconds?  Speed is important and RBAR (pronounced "ree-bar" and is a "Modenism" for "Row By Agonizing Row") in the form of single row loops and cursors should be avoided at all cost.  There are a small handful of exceptions for loops in functions, but for the most part, they are an evil thing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)