• Ok... the Billion row results are in. On the original code that Russ submitted (looks like a modification of Itzik's code),

    Insert Duration - 00:29:00

    Convert column to NOT NULL - 01:01:42

    Add Clustered PK - 01:27:59

    Total - 2:58:41

    First, we'll revert to Itzik's original rendition of the code, which is about 5 minutes faster on my humble box... just notice the difference in the placement of the ROW_NUMBER...

    ;WITH

    L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows

    L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows

    L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows

    L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows

    L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows

    L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows

    Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY C) AS n FROM L5)

    SELECT n AS i INTO NUMBERS_TEST FROM Nums WHERE n <= 1000000000
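As a quick sanity check after the SELECT...INTO completes, the row count can be verified with COUNT_BIG, which returns a BIGINT and so won't overflow at this scale (the dbo schema is assumed here):

```sql
-- Sanity check: the count should match the limit in the WHERE clause above.
-- COUNT_BIG returns BIGINT, so it's safe for row counts beyond 2^31.
SELECT COUNT_BIG(*) AS RowsLoaded
FROM dbo.NUMBERS_TEST;
```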

    Now, lemme show you one of those "undocumented" tricks that makes all the difference in the world... let's add something that will make the "i" column NOT NULL as it's being built...

    ;WITH

    L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows

    L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows

    L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows

    L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows

    L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows

    L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows

    Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY C) AS n FROM L5)

    SELECT ISNULL(n,0) AS i INTO NUMBERS_TEST FROM Nums WHERE n <= 1000000000

    Believe it or not, it doesn't slow the code down at all and has the added benefit of making the result column NOT NULL. That wipes out the whole 01:01:42 for converting the column to NOT NULL in a separate step.
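If you want to verify the trick worked, the column's nullability can be checked from the catalog views after the build. This is just a sketch (the dbo schema is assumed):

```sql
-- Confirm that SELECT...INTO created the column as NOT NULL.
-- is_nullable = 0 means the ISNULL() wrapper did its job.
SELECT c.name, c.is_nullable
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.NUMBERS_TEST');
```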

    Now, for the big surprise... after the table was built, the dedicated database I made for this test had expanded to 16.5 GB. The database is set to the SIMPLE recovery model, and while the "used" portion did drop back down to that vicinity afterward, the MDF file expanded to over 30 GB to build the clustered index. In other words, adding the clustered index temporarily caused the disk usage to almost double because of the sorts involved.
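For reference, the clustered PK step timed above would be of this general form; the exact statement used in the test isn't shown, so the constraint name here is just an example:

```sql
-- Add the clustered primary key; this is the step that needed the
-- extra MDF space (over 30 GB total) for the sort.
ALTER TABLE dbo.NUMBERS_TEST
    ADD CONSTRAINT PK_NUMBERS_TEST
    PRIMARY KEY CLUSTERED (i);
```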

    I will say that I'm pretty happy that I don't have to create a Billion row numbers table every day!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)