SQL Server Central is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
Search:  
 
 

Writing Faster T-SQL

By Jacob Sebastian, 2007/05/22

Total article views: 25334 | Views in the last 30 days: 489

Introduction

This is the first installment of a series of articles which would look into some of the performance tips that could be applicable to the day-to-day programming life of most SQL Server programmers. Anyone would agree that performance is the key to the success of any application. In this series, I will try to present some of the common scenarios and explain a few alternate approaches which would provide better performance.

This morning I read the article by Wayne Fillis where he presented an excellent explanation of the NOLOCK query optimizer hint. To demonstrate the performance benefits, he populated a table with 5 million records and ran the queries with and without NOLOCK hint. He warns that the TSQL code that generated 5 million rows will take more than an hours on a high-end machine and may take a few hours on a machine with lower configuration.

Many times we need to generate large number of rows to do various performance tests. However, most of the times, it takes a few hours to generate the data that we need. It would be interesting to look into alternate ways to generate data quickly and efficiently. This article presents a very efficient approach to generate large number of rows.

Most of the time, I have seen people writing a loop that runs for N times and executes an INSERT statement. Thus, to generate 5 million rows, you need to execute 5 million INSERT statements. This adds to the major part of the delay that we experience while generating data. If we could use a single INSERT statement that inserts 5 million records at one go, we could gain amazing performance benefits.

Approach

As I mentioned above, we would be using a different approach to generate records. We will use a single INSERT statement that will insert 5 million records to the target table. To do that, essentially, we need to have a SELECT statement that returns the desired number of rows. How do we get into that? The steps below demonstrate that.

The first step is to write a query that returns 10 rows containing numbers 0 to 9. The following TSQL query does that.

    1 SELECT 0 as Number

    2     UNION SELECT 1

    3     UNION SELECT 2

    4     UNION SELECT 3

    5     UNION SELECT 4

    6     UNION SELECT 5

    7     UNION SELECT 6

    8     UNION SELECT 7

    9     UNION SELECT 8

   10     UNION SELECT 9


Now let us write a query that uses the digits above and generate a result set containing 100 records (0 to 100). 

    1 WITH digits AS (

    2     SELECT 0 as Number

    3     UNION SELECT 1

    4     UNION SELECT 2

    5     UNION SELECT 3

    6     UNION SELECT 4

    7     UNION SELECT 5

    8     UNION SELECT 6

    9     UNION SELECT 7

   10     UNION SELECT 8

   11     UNION SELECT 9

   12 )

   13 SELECT (tens.Number * 10) + ones.Number as Number

   14 FROM digits as tens

   15 CROSS JOIN digits as ones

I am using a CTE (Common Table Expression) to simplify the code above.  A CROSS JOIN is used on the previous query and it generates 100 records on the fly. CTE is introduced by SQL Server 2005. You can get the same results in SQL Server 2000 by running the following query.

    1 SELECT (tens.Number * 10) + ones.Number as Number

    2 FROM (

    3     SELECT 0 as Number

    4     UNION SELECT 1

    5     UNION SELECT 2

    6     UNION SELECT 3

    7     UNION SELECT 4

    8     UNION SELECT 5

    9     UNION SELECT 6

   10     UNION SELECT 7

   11     UNION SELECT 8

   12     UNION SELECT 9

   13 ) as tens

   14 CROSS JOIN (

   15     SELECT 0 as Number

   16     UNION SELECT 1

   17     UNION SELECT 2

   18     UNION SELECT 3

   19     UNION SELECT 4

   20     UNION SELECT 5

   21     UNION SELECT 6

   22     UNION SELECT 7

   23     UNION SELECT 8

   24     UNION SELECT 9

   25 ) as ones

The above example clearly shows how useful it is to use a CTE. At the next step, let us generate 10 million records.

    1 WITH digits AS (

    2     SELECT 0 as Number

    3     UNION SELECT 1

    4     UNION SELECT 2

    5     UNION SELECT 3

    6     UNION SELECT 4

    7     UNION SELECT 5

    8     UNION SELECT 6

    9     UNION SELECT 7

   10     UNION SELECT 8

   11     UNION SELECT 9

   12 )

   13 SELECT

   14       (millions.Number * 1000000)

   15     + (hThousands.Number * 100000)

    16      + (tThousands.Number * 10000)

    17      + (thousands.Number * 1000)

    18      + (hundreds.Number * 100)

    19     + (tens.Number * 10)

    20     + ones.Number AS Number

    21 FROM digits AS ones

    22 CROSS JOIN digits AS tens

    23 CROSS JOIN digits AS hundreds

    24 CROSS JOIN digits AS thousands

    25 CROSS JOIN digits AS tThousands

   26 CROSS JOIN digits AS hThousands

   27 CROSS JOIN digits AS millions

The above TSQL generates 10 million records on the fly. It may take 1 to 2 minutes depending upon the configuration of your computer. Now that we have the required number of records, we can write the INSERT statement to populate the table that we need.

For the purpose of this example, I would like to take the same example Wayne Fillis presented. I am going to re-write the first example he presented, using the new approach we discussed above. The following query generates 5 million records and inserts them to the ORDERS table. 

    1 WITH

    2 -- first CTE which returns 10 rows (0-9)

    3 digits AS (

    4     SELECT 0 as Number

    5     UNION SELECT 1

    6     UNION SELECT 2

    7     UNION SELECT 3

    8     UNION SELECT 4

    9     UNION SELECT 5

   10     UNION SELECT 6

   11     UNION SELECT 7

   12     UNION SELECT 8

   13     UNION SELECT 9

   14 )

   15 -- second CTE which returns 10 million rows by using

   16 -- a CROSS JOIN on the first CTE

   17 , dig AS (

   18     SELECT

   19           (millions.Number * 1000000)

   20         + (hThousands.Number * 100000)

   21         + (tThousands.Number * 10000)

   22         + (thousands.Number * 1000)

   23         + (hundreds.Number * 100)

   24         + (tens.Number * 10)

   25         + ones.Number AS Number

   26     FROM digits AS ones

   27     CROSS JOIN digits AS tens

   28     CROSS JOIN digits AS hundreds

   29     CROSS JOIN digits AS thousands

   30     CROSS JOIN digits AS tThousands

   31     CROSS JOIN digits AS hThousands

   32     CROSS JOIN digits AS millions

   33 )

   34 -- Third CTE which generates a "Product ID" and "Order Date".

   35 -- Product ID is repeated after 500,000 records.

   36 -- after every 1000 records, "order date" moves backward.

   37 , prod AS (

   38 SELECT number, (number % 500000) + 1 AS ProductID, GETDATE() - (number / 1000) AS OrderDate

   39 FROM dig

   40 WHERE number < 5000000 -- we need only 5 million records :-)

   41 )

   42 -- the insert statement goes here

   43 INSERT    Orders (ProductID, OrderDate)

   44     SELECT productID,

   45     DATEADD(dd,0, DATEDIFF(dd,0,OrderDate)) -- strips off "time" from date value

   46     FROM prod

The above code runs in 2.45 minutes on my laptop (Toshiba Satellite Pro, Solo Core, 1 GB RAM). It runs much faster compared to the loop that inserts same number of records in close to 90 minutes.

Conclusions

I referred to the example presented by Wayne for the purpose of demonstration only. His article presented a practical scenario where we need large number of records generated for various performance tests. This article presents a more optimized TSQL code that generates data.

By Jacob Sebastian, 2007/05/22

Total article views: 25334 | Views in the last 30 days: 489
Your response
 
 
Related tags

Performance Tuning    
T-SQL    
 
Like this? Try these...

Moving Databases

By Andy Warren | Category: Administration
| 6,904 reads
Already registered?  

Free registration required

To read the rest of this article, and access thousands of other articles, we ask you to register on the site and subscribe to our newsletters.

Register

E-mail address:
Password:
Password (confirm):

  

Subscriptions

We ask you to register on the site and subscribe to our newsletters. Subscribing to our newsletters gets you:

  • ALL of our content (thousands of articles, scripts, and forum postings)
  • A daily newsletter (example)
  • A weekly news round up (example)
  • The opportunity to ask and answer questions in our forums
  • A daily Question of the Day to test and help you increase your knowledge of SQL Server.

We ask that you give the newsletter a try for a week. Over 200,000 SQL Server Professionals a day find it entertaining and useful. If not, you are welcome to unsubscribe at anytime.

Steve Jones
Editor, SQLServerCentral.com