• Ok... the Billion row results are in. On the original code that Russ submitted (looks like a modification of Itzik's code),

    Insert Duration - 00:29:00

    Convert column to NOT NULL - 01:01:42

    Add Clustered PK - 01:27:59

    Total - 2:58:41

    First, we'll revert to Itzik's original rendition of the code, which is about 5 minutes faster on my humble box... just notice the difference in the placement of the ROW_NUMBER...

    ;WITH

    L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows

    L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows

    L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows

    L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows

    L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows

    L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows

    Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY C) AS n FROM L5)

    SELECT n AS i INTO NUMBERS_TEST FROM Nums WHERE n <= 1000000000
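As a quick sanity check after the SELECT...INTO completes, the row count can be verified with COUNT_BIG, which returns a BIGINT and so won't overflow at this scale (the dbo schema is assumed here):

```sql
-- Sanity check: the count should match the limit in the WHERE clause above.
-- COUNT_BIG returns BIGINT, so it's safe for row counts beyond 2^31.
SELECT COUNT_BIG(*) AS RowsLoaded
FROM dbo.NUMBERS_TEST;
```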

    Now, lemme show you one of those "undocumented" tricks that makes all the difference in the world... let's add something that will make the "i" column NOT NULL as it's being built...

    ;WITH

    L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows

    L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows

    L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows

    L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows

    L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows

    L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows

    Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY C) AS n FROM L5)

    SELECT ISNULL(n,0) AS i INTO NUMBERS_TEST FROM Nums WHERE n <= 1000000000

    Believe it or not, it doesn't slow the code down at all and has the added benefit of making the result column NOT NULL. That wipes out the whole 01:01:42 for converting the column to NOT NULL in a separate step.
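If you want to verify the trick worked, the column's nullability can be checked from the catalog views after the build. This is just a sketch (the dbo schema is assumed):

```sql
-- Confirm that SELECT...INTO created the column as NOT NULL.
-- is_nullable = 0 means the ISNULL() wrapper did its job.
SELECT c.name, c.is_nullable
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.NUMBERS_TEST');
```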

    Now, for the big surprise... after the table was built, the dedicated database I made for this test had expanded to 16.5 GB. The database is set to the SIMPLE recovery model, and while the "used" portion did drop back down to that vicinity afterward, the MDF file expanded to over 30 GB to build the clustered index. In other words, adding the clustered index temporarily caused the disk usage to almost double because of the sorts involved.
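For reference, the clustered PK step timed above would be of this general form; the exact statement used in the test isn't shown, so the constraint name here is just an example:

```sql
-- Add the clustered primary key; this is the step that needed the
-- extra MDF space (over 30 GB total) for the sort.
ALTER TABLE dbo.NUMBERS_TEST
    ADD CONSTRAINT PK_NUMBERS_TEST
    PRIMARY KEY CLUSTERED (i);
```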

    I will say that I'm pretty happy that I don't have to create a Billion row numbers table every day!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)