FizzBuzz

  • By the way... I took a look at most of the code in the contest you have going on... all you folks that are converting the number to VARCHAR(6), VARCHAR(7), VARCHAR(8), etc.: I'm thinking that your code may not actually be creating the correct output when someone runs it for 150,000,000, which contains 9 digits... 😉
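
    A quick way to see the problem (a minimal repro added here for illustration, not code from the contest itself; as I recall, an INT-to-VARCHAR conversion that doesn't fit raises an arithmetic overflow error):

    -- Too short for 9 digits: raises Msg 8115, "Arithmetic overflow
    -- error converting expression to data type varchar."
    SELECT CAST(150000000 AS VARCHAR(8));

    -- VARCHAR(9) or wider returns '150000000' as expected.
    SELECT CAST(150000000 AS VARCHAR(9));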

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • CirquedeSQLeil (2/24/2010)


    WayneS (2/24/2010)


    FYI, Gianluca's unpivot works the fastest on my system (813ms), though Lynn's is right behind (833ms) on initial runs. But when run repeatedly (and randomly), they actually flip-flop back and forth, though both stay consistently around 850ms. Both are the versions posted at ask.ssc.

    What is your test method?

    Nothing fancy... or noteworthy. Just copy each from ask.ssc into a different query window, and run repeatedly.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • reidres (2/24/2010)


    ...Gianluca's unpivot works for 1,000,000 rows only, no more or fewer. So it is neither scalable nor flexible--and who wants only a million rows?

    That would be because he only does enough cross-joins in the millionrows CTE to get it to one million... add another cross-join of the thousandrows CTE, and it will blossom to 1 billion.
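
    To make the cascade concrete, here's a sketch of the pattern (the CTE names follow the thread; the code itself is illustrative, not Gianluca's actual query):

    -- Each level cross joins the previous one, multiplying the row count.
    ;WITH tenrows(n) AS
    (   -- 10 rows from a table value constructor
        SELECT 1 FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS v (x)
    ),
    thousandrows(n) AS
    (   -- 10 x 10 x 10 = 1,000 rows
        SELECT 1 FROM tenrows a CROSS JOIN tenrows b CROSS JOIN tenrows c
    ),
    millionrows(n) AS
    (   -- 1,000 x 1,000 = 1,000,000 rows
        SELECT 1 FROM thousandrows a CROSS JOIN thousandrows b
    ),
    billionrows(n) AS
    (   -- the extra cross join: 1,000,000 x 1,000 = 1,000,000,000 rows
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM millionrows a CROSS JOIN thousandrows b
    )
    SELECT TOP 10 n FROM billionrows; -- TOP 10 just to prove it runs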

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (2/25/2010)


    CirquedeSQLeil (2/24/2010)


    WayneS (2/24/2010)


    FYI, Gianluca's unpivot works the fastest on my system (813ms), though Lynn's is right behind (833ms) on initial runs. But when run repeatedly (and randomly), they actually flip-flop back and forth, though both stay consistently around 850ms. Both are the versions posted at ask.ssc.

    What is your test method?

    Nothing fancy... or noteworthy. Just copy each from ask.ssc into a different query window, and run repeatedly.

    OK. I think you are comparing apples to oranges there. Did you modify them all to make sure they were outputting the same way - dumping into a temp table, or just returning the last value?
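
    One common way to level that playing field (a sketch of a typical timing harness, added here for illustration - dbo.SomeRowSource is a hypothetical stand-in for whichever row generator is under test) is to assign the output to a throwaway variable so SSMS display time stays out of the measurement:

    -- The variable soaks up every row, so client rendering time
    -- doesn't pollute the elapsed-time figures.
    DECLARE @bitbucket VARCHAR(10);

    SET STATISTICS TIME ON;
    SELECT @bitbucket = CASE
                            WHEN N % 15 = 0 THEN 'FizzBuzz'
                            WHEN N % 3  = 0 THEN 'Fizz'
                            WHEN N % 5  = 0 THEN 'Buzz'
                            ELSE CAST(N AS VARCHAR(10))
                        END
    FROM dbo.SomeRowSource; -- hypothetical stand-in with a column N
    SET STATISTICS TIME OFF;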

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Lucky I went for varchar(255) then Jeff.

    That should cover it!

    Cheers

    Steve

  • I can cut somewhere between 0.2 and 4 per cent off the elapsed time at 150,000,000 rows for Jason's method (while limiting scalability to about 4 billion rows instead of 1,000 billion) by using a base CTE large enough that only 5 cross joins are needed, instead of the larger number needed with a 10-row base CTE. At least, on my machine the difference in run speed between the larger base CTE and the 10-row one varies between 0.2% and 3.8%; but we all know this stuff is environment-sensitive, and there may be no difference at all on another machine. Since the 6th root of 2**32 is about 40.3, this requires a 41-row base CTE if 2**32 is where I want to scale to.

    Here's the code (it's clearly a tuned version of Jason's code, not anything new):

    declare @Limit bigint set @Limit = 150000000 -- @Limit can be anything from 1 to 4750104241 (41**6)

    set statistics time on

    ;with base(n) as
    (select  1 union all select  2 union all select  3 union all select  4 union all select  5 union all
     select  6 union all select  7 union all select  8 union all select  9 union all select 10 union all
     select 11 union all select 12 union all select 13 union all select 14 union all select 15 union all
     select 16 union all select 17 union all select 18 union all select 19 union all select 20 union all
     select 21 union all select 22 union all select 23 union all select 24 union all select 25 union all
     select 26 union all select 27 union all select 28 union all select 29 union all select 30 union all
     select 31 union all select 32 union all select 33 union all select 34 union all select 35 union all
     select 36 union all select 37 union all select 38 union all select 39 union all select 40 union all
     select 41),
    bigun(n) as
    (select 1
     from base b1
     cross join base b2
     cross join base b3
     cross join base b4
     cross join base b5
     cross join base b6)
    SELECT count(all case
                     when n % 15 = 0 then 'FizzBuzz' -- multiples of 3 AND 5
                     when n % 3  = 0 then 'Fizz'
                     when n % 5  = 0 then 'Buzz'
                     else cast(n as VarChar(10))
                 end)
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY n) FROM bigun) D (n)
    WHERE n <= abs(@Limit);

    set statistics time off;
    go

    Restricting the scalability to the range of a positive int would lead to a 36-row base CTE, but that doesn't have any useful effect on the speed. In fact it seems to have a bad effect: it always runs slightly slower with 36 base elements than with 41, which I don't understand, because with 41 all 5 joins are still needed to get to 150,000,000, just as they are with 36 elements. I'm going to look at the actual query plans - predicted ones won't be useful - but I don't really expect to discover what's going on.
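
    For anyone who wants to follow along, one way to capture the actual plan in text form (a generic sketch; SSMS's "Include Actual Execution Plan" option does the same job graphically):

    SET STATISTICS PROFILE ON;  -- returns actual plan rows alongside the results
    -- ... run the 36-element and 41-element versions here ...
    SET STATISTICS PROFILE OFF;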

    Tom

  • Sure beats this more conventional CTE approach by about a factor of 4.

    CAVEAT: Don't use this one. There are better ones out there!

    set statistics time on
    ;with cte1 as
    (
        select top 150000000 row_number() over (order by (select 0)) as row
        from master.sys.all_columns
        cross join master.sys.all_columns AS T2
        cross join master.sys.all_columns AS T3
    )
    SELECT count(all case
                     when row % 15 = 0 then 'FIZZBUZZ'
                     when row % 5  = 0 then 'BUZZ'
                     when row % 3  = 0 then 'FIZZ'
                     else cast(row as varchar(9))
                 end)
    from cte1
    set statistics time off

  • However, if I rewrite my previous post using a WHERE clause instead of TOP, it makes a massive difference and is very, very close to Tom's.

    set statistics time on
    ;with cte1 as
    (
        select row_number() over (order by (select 0)) as row
        from master.sys.all_columns
        cross join master.sys.all_columns AS T2
        cross join master.sys.all_columns AS T3
    )
    SELECT count(all case
                     when row % 15 = 0 then 'FIZZBUZZ'
                     when row % 5  = 0 then 'BUZZ'
                     when row % 3  = 0 then 'FIZZ'
                     else cast(row as varchar(9))
                 end)
    from cte1
    where row < 150000001
    set statistics time off

  • Tom.Thomson (2/26/2010)


    I can cut somewhere between 0.2 and 4 per cent off the elapsed time at 150,000,000 rows for Jason's method (while limiting scalability to about 4 billion rows instead of 1,000 billion) by using a base CTE large enough that only 5 cross joins are needed, instead of the larger number needed with a 10-row base CTE. At least, on my machine the difference in run speed between the larger base CTE and the 10-row one varies between 0.2% and 3.8%; but we all know this stuff is environment-sensitive, and there may be no difference at all on another machine. Since the 6th root of 2**32 is about 40.3, this requires a 41-row base CTE if 2**32 is where I want to scale to.

    Here's the code (it's clearly a tuned version of Jason's code, not anything new):

    declare @Limit bigint set @Limit = 150000000 -- @Limit can be anything from 1 to 4750104241 (41**6)

    set statistics time on

    ;with base(n) as
    (select  1 union all select  2 union all select  3 union all select  4 union all select  5 union all
     select  6 union all select  7 union all select  8 union all select  9 union all select 10 union all
     select 11 union all select 12 union all select 13 union all select 14 union all select 15 union all
     select 16 union all select 17 union all select 18 union all select 19 union all select 20 union all
     select 21 union all select 22 union all select 23 union all select 24 union all select 25 union all
     select 26 union all select 27 union all select 28 union all select 29 union all select 30 union all
     select 31 union all select 32 union all select 33 union all select 34 union all select 35 union all
     select 36 union all select 37 union all select 38 union all select 39 union all select 40 union all
     select 41),
    bigun(n) as
    (select 1
     from base b1
     cross join base b2
     cross join base b3
     cross join base b4
     cross join base b5
     cross join base b6)
    SELECT count(all case
                     when n % 15 = 0 then 'FizzBuzz' -- multiples of 3 AND 5
                     when n % 3  = 0 then 'Fizz'
                     when n % 5  = 0 then 'Buzz'
                     else cast(n as VarChar(10))
                 end)
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY n) FROM bigun) D (n)
    WHERE n <= abs(@Limit);

    set statistics time off;
    go

    Restricting the scalability to the range of a positive int would lead to a 36-row base CTE, but that doesn't have any useful effect on the speed. In fact it seems to have a bad effect: it always runs slightly slower with 36 base elements than with 41, which I don't understand, because with 41 all 5 joins are still needed to get to 150,000,000, just as they are with 36 elements. I'm going to look at the actual query plans - predicted ones won't be useful - but I don't really expect to discover what's going on.

    Thanks for doing that. I will test it out and see how it works. Do you mind if I use this code for an article (blog or otherwise) if my testing shows the improvements?

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • CirquedeSQLeil (2/26/2010)


    Thanks for doing that. I will test it out and see how it works. Do you mind if I use this code for an article (blog or otherwise) if my testing shows the improvements?

    Hi Jason, I reckon you did all the spade work on this and I've just played about at the edges, as it were, hanging on to your coattails. So as far as I'm concerned it's all yours to do as you wish with.

    Tom

  • Jason... you might also want to try Gianluca's enhancement with the original "seed" using UNPIVOT. Between what he came up with, your original work, and Tom's enhancement, you should end up with the fastest code on the block.
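
    For reference, the general shape of an UNPIVOT seed looks something like this (a sketch only - Gianluca's actual code may well differ):

    -- Ten rows produced by unpivoting a single ten-column row.
    ;WITH seed(n) AS
    (
        SELECT n
        FROM (SELECT 1, 2, 3, 4, 5, 6, 7, 8, 9, 10) AS t (c1, c2, c3, c4, c5, c6, c7, c8, c9, c10)
        UNPIVOT (n FOR c IN (c1, c2, c3, c4, c5, c6, c7, c8, c9, c10)) AS u
    )
    SELECT n FROM seed;  -- cross join this seed to itself to scale up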

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/26/2010)


    Jason... you might also want to try Gianluca's enhancement with the original "seed" using UNPIVOT. Between what he came up with, your original work, and Tom's enhancement, you should end up with the fastest code on the block.

    I like the suggestion. I had tested Gianluca's code, but hadn't delved into it.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Tom.Thomson (2/26/2010)


    CirquedeSQLeil (2/26/2010)


    Thanks for doing that. I will test it out and see how it works. Do you mind if I use this code for an article (blog or otherwise) if my testing shows the improvements?

    Hi Jason, I reckon you did all the spade work on this and I've just played about at the edges, as it were, hanging on to your coattails. So as far as I'm concerned it's all yours to do as you wish with.

    Thanks - you will get credit for your work.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • In the meantime, is this a possibility? This variation also sets the threshold to 4750104241 (1681 x 1681 x 1681), the same as 41^6.

    set statistics time on
    ;with cte1 as
    (
        select TOP 1681 1 as col from master..spt_values
    ),
    cte2 as
    (
        select row_number() over (order by (select 0)) as row
        from cte1
        cross join cte1 AS T2
        cross join cte1 AS T3
    )
    SELECT count(all case
                     when row % 15 = 0 then 'FIZZBUZZ'
                     when row % 5  = 0 then 'BUZZ'
                     when row % 3  = 0 then 'FIZZ'
                     else cast(row as varchar(9))
                 end)
    from cte2
    where row < 150000001
    set statistics time off

  • steve-893342 (2/28/2010)


    In the meantime, is this a possibility? This variation also sets the threshold to 4750104241 (1681 x 1681 x 1681), the same as 41^6.

    set statistics time on
    ;with cte1 as
    (
        select TOP 1681 1 as col from master..spt_values
    ),
    cte2 as
    (
        select row_number() over (order by (select 0)) as row
        from cte1
        cross join cte1 AS T2
        cross join cte1 AS T3
    )
    SELECT count(all case
                     when row % 15 = 0 then 'FIZZBUZZ'
                     when row % 5  = 0 then 'BUZZ'
                     when row % 3  = 0 then 'FIZZ'
                     else cast(row as varchar(9))
                 end)
    from cte2
    where row < 150000001
    set statistics time off

    It's absolutely a possibility. Just understand that since you're reading from a table, this will have a lot more reads than the pure memory solutions that others have offered. I've not tested physical cross-joins in a CTE, but I do know that simple cross-joins on a physical table to make very large numbers of rows can cause explosive growth of log files even when the database is in the SIMPLE recovery mode. When I tested the use of more than one cross-join to make a billion-row return several months ago, one of the log files (I forget which one) exploded to over 40GB. Doing the same test with the memory-only solutions caused zero growth.
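
    For anyone who wants to verify the difference in reads for themselves, a quick way to compare (a generic sketch, not from Jeff's tests):

    SET STATISTICS IO ON;   -- logical/physical reads per table referenced
    SET STATISTICS TIME ON; -- CPU and elapsed time
    -- ... run the spt_values version and a memory-only version here ...
    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;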

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
