Very nicely done, Jacob.
Please... not trying to take anything away from this great article... just sharing a different method...
As a slightly easier to remember (and, a bit faster, too) approach to generating numbers, check out the following (comparison between Itzek's and a method that I and several others use)...
SET STATISTICS TIME ON
GO
DECLARE @Bitbucket INT
--=============================================================================
PRINT REPLICATE('=',100)
PRINT 'Itzek''s method:'
;WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows
L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows
num AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS N FROM L5)
SELECT @Bitbucket = N FROM NUM WHERE N <= 1000000;
--=============================================================================
PRINT REPLICATE('=',100)
PRINT 'Jeff Moden''s Method'
; WITH cTally AS
(-----------------------------------------------------------------------------
--==== High performance CTE equivalent of a Tally or Numbers table
SELECT TOP (1000000)
ROW_NUMBER() OVER (ORDER BY t1.ID) AS N
FROM Master.sys.SysColumns t1
CROSS JOIN Master.sys.SysColumns t2
)-----------------------------------------------------------------------------
SELECT @Bitbucket = N FROM cTally --Do your outer join with table being checked here
PRINT REPLICATE('=',100)
True, Itzeks's will generate more than what an INT can handle, but, how often do you think you're gonna need to generate more than 121 million numbers? 😉
Here's the same thing as a programmable function...
CREATE FUNCTION dbo.fnTally
/****************************************************************************************
Purpose:
Given a range of Integers not exceeding a count of 121 million, return the range of
numbers as a table.
Notes: Preserved as an "inline" single statement function for sheer performance.
Therefore, no error checking, etc.
Revision History:
Rev 00 - 23 Dec 2005 - Jeff Moden - Initial creation and unit test
****************************************************************************************/
--===== Declare the parameters
(
@piStartNumber INT,
@piEndNumber INT
)
RETURNS TABLE
AS
RETURN (WITH cTally AS
(--------------------------------------------------------------------------------
--==== High performance CTE equivalent of a Tally or Numbers table
SELECT TOP (@piEndNumber-@piStartNumber+1)
ROW_NUMBER() OVER (ORDER BY t1.ID) AS Number
FROM Master.sys.SysColumns t1
CROSS JOIN Master.sys.SysColumns t2
)--------------------------------------------------------------------------------
SELECT N = Number+@piStartNumber-1 FROM cTally
)
Still, a small (11k rows - 30 years of dates) permanent Tally table with a Clustered Index on N will usually beat calculated table functions once the table is cached.
Again, I say, nice article, Jacob! Do it again!
--Jeff Moden
Change is inevitable... Change for the better is not.