### What is the most efficient method of generating a range of values

A lead-developer friend asked me to review a T-SQL code they had inherited which generated a number range between 2 integer values.

He believed that the T-SQL was a prime example of spaghetti code and wanted to see some alternatives. Of course, first port of call was SQLServerCentral to see what alternatives might exist and sure enough I found two scripts.

Chris Cubley's solution was closest to the one I was given to review where as DavidT's script would probably be the one that most procedural programmers would come up with.

I paraphrased all three versions into functions and measured the performance as each of them produced the number range 5 to 567720.

Author | Function | Average execution time |
---|---|---|

Chris Cubley | dbo.fnCrossJoinRange | 30 secs. |

DavidT | dbo.fnRange | 22 secs. |

?????? | dbo.fnUnionRange | 12 secs. |

### Function based on DavidT's script

See Generate Number Range 2 for the original script.

CREATE FUNCTION dbo.fnRange ( @first int , --##PARAM @first The lowest value in the range. @last int --##PARAM @last The highest value in the range. ) RETURNS @values TABLE ( value int primary key ) AS BEGIN DECLARE @temp int -- If the values have been supplied transposed then swap them to the correct way around. IF @first > @last BEGIN SET @temp = @first SET @first = @last SET @last = @temp END WHILE @first<=@last BEGIN INSERT INTO @values VALUES(@first) SET @first = @first+1 END RETURN END GO

The above function is straight forward as it simply loops between the first and last value inserting records into a table variable that is then returned.

This function has three main advantages

- It is simple.
- Values are inserted in order.
- For small number ranges it is very efficient
- The range is limited only by the bounds of the SQL INT datatype. This can be extended by using BIGINT

Its disadvantage is that as number ranges get progressively larger its execution time increases. The reality is that execution times become excessive once the range gets into 7 figures.

### Chris Cubley's version

See Generate Number Range for the original script.

In Chris's original he used a temporary table to hold the values 0 to 9. As this is a static dataset I decided to use a view instead.

CREATE VIEW dbo.Digits AS select 0 as value union all select 1 as value union all select 2 as value union all select 3 as value union all select 4 as value union all select 5 as value union all select 6 as value union all select 7 as value union all select 8 as value union all select 9 as value GO --##SUMMARY This function provides a table of integers between the two specified arguments. --##REMARKS The procedure allows for the two arguments being transposed. CREATE FUNCTION dbo.fnCrossJoinRange ( @first int , --##PARAM @first The lowest value in the range. @last int --##PARAM @last The highest value in the range. ) RETURNS @values TABLE ( value int primary key ) AS BEGIN INSERT INTO @values(value) SELECT num = units.value + (tens.value * 10) + (hundreds.value * 100) + (Thousands.value * 1000) + (TenThousands.value * 10000) + (CThousands.value * 100000) + (Millions.value * 1000000) FROM dbo.Digits units CROSS JOIN dbo.Digits tens CROSS JOIN dbo.Digits hundreds CROSS JOIN dbo.Digits Thousands CROSS JOIN dbo.Digits TenThousands CROSS JOIN dbo.Digits CThousands CROSS JOIN dbo.Digits Millions where units.value + (tens.value * 10) + (hundreds.value * 100) + (Thousands.value * 1000) + (TenThousands.value * 10000) + (CThousands.value * 100000) + (Millions.value * 1000000) BETWEEN @first and @last RETURN END GO

The idea behind this function is that instead of doing a row by row insert we are playing to the strengths of SQL Server and are using set based operations. The theory behind this is that we cross join our number range (0 to 9) to itself repeatedly.

- units contain the values 0 to 9
- tens contain the values 10 to 90 in steps of 10 and zero
- hundreds contain the values 100 to 900 in steps of 100 and zero
- ...etc

Although this function was by far the slowest its lack of performance is principally due to the sheer volume of records it generates before applying a filter. If you genuinely need to generate a range of numbers in the millions then it will actually out-perform the dbo.fnRange function.

### Unknown author's version

The function shown below combines the technique that Chris used in his script with a filter to limit the number of records that are combined.

Like Chris's version it is effectively doing 70 SELECT statements

- 0 to 9
- <10 to 90 in steps of 10 and zero
- <100 to 900 in steps of 100 and zero
- <1000 to 9000 in steps of 1000 and zero
- <10000 to 90000 in steps of 10000 and zero
- <100000 to 900000 in steps of 100000 and zero
- <1000000 to 9000000 in steps of 1000000 and zero

CREATE FUNCTION dbo.fnUnionRange ( @first int , --##PARAM @first The lowest value in the range. @last int --##PARAM @last The highest value in the range. ) RETURNS @values TABLE ( value int primary key ) AS BEGIN INSERT INTO @values(value) select units.value + tens.value + hundreds.value + Thousands.value + TenThousands.value + CThousands.value + Millions.value AS list from( select 0 as value union all select 1 as value where 1 <= @last union all select 2 as value where 2 <= @last union all select 3 as value where 3 <= @last union all select 4 as value where 4 <= @last union all select 5 as value where 5 <= @last union all select 6 as value where 6 <= @last union all select 7 as value where 7 <= @last union all select 8 as value where 8 <= @last union all select 9 as value where 9 <= @last ) AS Units , ( select 0 as value union all select 10 as value where 10 <= @last union all select 20 as value where 20 <= @last union all select 30 as value where 30<= @last union all select 40 as value where 40 <= @last union all select 50 as value where 50 <= @last union all select 60 as value where 60 <= @last union all select 70 as value where 70 <= @last union all select 80 as value where 80 <= @last union all select 90 as value where 90 <= @last ) AS Tens, ( select 0 as value union all select 100 as value where 100 <= @last union all select 200 as value where 200 <= @last union all select 300 as value where 300 <= @last union all select 400 as value where 400 <= @last union all select 500 as value where 500 <= @last union all select 600 as value where 600 <= @last union all select 700 as value where 700 <= @last union all select 800 as value where 800 <= @last union all select 900 as value where 900 <= @last ) AS Hundreds, ( select 0 as value union all select 1000 as value where 1000 <= @last union all select 2000 as value where 2000 <= @last union all select 3000 as value where 3000 <= @last union all select 4000 as value where 4000 <= @last union all select 5000 as value where 5000 <= @last union all select 6000 as value where 6000 <= @last union all select 7000 as value where 7000 <= @last union all select 8000 as value where 8000 <= @last union all select 9000 as value where 9000 <= @last ) AS Thousands, ( select 0 as value union all select 10000 as value where 10000 <= @last union all select 20000 as value where 20000 <= @last union all select 30000 as value where 30000 <= @last union all select 40000 as value where 40000 <= @last union all select 50000 as value where 50000 <= @last union all select 60000 as value where 60000 <= @last union all select 70000 as value where 70000 <= @last union all select 80000 as value where 80000 <= @last union all select 90000 as value where 90000 <= @last ) AS TenThousands, ( select 0 as value union all select 100000 as value where 100000 <= @last union all select 200000 as value where 200000 <= @last union all select 300000 as value where 300000 <= @last union all select 400000 as value where 400000 <= @last union all select 500000 as value where 500000 <= @last union all select 600000 as value where 600000 <= @last union all select 700000 as value where 700000 <= @last union all select 800000 as value where 800000 <= @last union all select 900000 as value where 900000 <= @last ) AS CThousands, ( select 0 as value union all select 1000000 as value where 1000000 <= @last union all select 2000000 as value where 2000000 <= @last union all select 3000000 as value where 3000000 <= @last union all select 4000000 as value where 4000000 <= @last union all select 5000000 as value where 5000000 <= @last union all select 6000000 as value where 6000000 <= @last union all select 7000000 as value where 7000000 <= @last union all select 8000000 as value where 8000000 <= @last union all select 9000000 as value where 9000000 <= @last ) AS Millions where units.value + tens.value + hundreds.value + Thousands.value + TenThousands.value + CThousands.value + Millions.value between @first and @last RETURN END GO

Verbose though the above function is, it is streets ahead of the others in terms of performance. I decided to combine the filtering techniques in this script with the elegance of Chris's version to produce a final version.

### The final version

This final version offers similar performance to the verbose version. but, to my eyes at least, gains in readability.

ALTER FUNCTION dbo.fnCrossJoinRange2 ( @first int , --##PARAM @first The lowest value in the range. @last int --##PARAM @last The highest value in the range. ) RETURNS @values TABLE ( value int primary key ) AS BEGIN INSERT INTO @values(value) SELECT num = units.value + (tens.value) + (hundreds.value ) + (Thousands.value ) + (TenThousands.value ) + (CThousands.value ) + (Millions.value ) FROM dbo.Digits units CROSS JOIN (SELECT value * 10 as value from dbo.Digits WHERE value * 10 <=@last) tens CROSS JOIN (SELECT value * 100 as value from dbo.Digits WHERE value * 100 <=@last) hundreds CROSS JOIN (SELECT value * 1000 as value from dbo.Digits WHERE value * 1000 <=@last) Thousands CROSS JOIN (SELECT value * 10000 as value from dbo.Digits WHERE value * 10000 <=@last) TenThousands CROSS JOIN (SELECT value * 100000 as value from dbo.Digits WHERE value * 100000 <=@last) CThousands CROSS JOIN (SELECT value * 1000000 as value from dbo.Digits WHERE value * 1000000 <=@last) Millions where units.value + (tens.value ) + (hundreds.value) + (Thousands.value ) + (TenThousands.value ) + (CThousands.value ) + (Millions.value ) BETWEEN @first and @last RETURN END GO

### Conclusions

The important thing to remember with SQL is that it is optimised for set based operations. A piece of code that would be very efficient in a procedural language won't necessarily be the best way to perform a task in T-SQL. This trips up most newbies and also the ORACLE crowd who tend to be much more dependent on cursors in their code.

One insert of 10,000 records is better than 10,000 inserts of 1 record.

However, I found that for small ranges, say under 500 records, there was little difference in performance between the procedural and set based versions. Once beyond the 500 threshold the set based version won hands down.