How does the Identity property work?

,

How does the Indentity property work?

Introduction

If you are like most DBAs, you have probably implemented some type of hash function for some

of your tables at one time or another. The most common type of hash is a sequential integer type

sequence that functions as a primary key, without having to use some character value. I'm not debating

whether this is a good idea or not, but there is often a need for it. Usually for invoice numbers, order numbers,

etc. one is looking for a sequence of numbers.

The Problem

When I had first arrived at the this comapny, the programmers had implemented a scheme where the next

key value was stored in a table. Actually a number of key values for different items were stored in

this table. Whenever a process needed a new key bvalue, it ran a stored procedure that retreived the next

key and incremented the value in the table. As the system was beign developed, this was an easy way to

manage the incremental values.

This was not, however, one of my favorite methods of handling this. Instead, I have always preferred using

the IDENTITY property for each table that needs this. I immediately started to convert our system to use

the identity property where needed.

Of course, one of the developers pointed out that SQL Server must be performing the same process that

their system was doing, namely storing the "next" value somewhere, retrieving it, and then incrementing the

stored value. We proceeeded to debate which method was in fact better. As we argued the merits of each

method, I wondered how SQL server's identity property would perform under load.

The Solution

When in doubt, test.

This was exactly what I decided to do. My first step was to setup a table:

create table MyTest2
(
	source char( 1)
	, MyID bigint  identity( 1, 1)
)

Then I setup some data files. These were simple files I generated in Excel and then saved

as text files. Each contains a single character representing the source file and a numeric field

that functions as a placeholder for the bcp utility. These are available for download here:

loada.txt, loadb.txt,

and loadc.txt (not very

original names, but they served the purpose). Once I had these data files, I decided to

conduct my identity test. Since I was going to use BCP and not include an identity value (since I

was testing the identity process), I included placeholder values to avoid generating format files.

I opened three command

windows and entered the following:

bcp "iqd_dev..MyTest2" in c:\loadc.txt -STank -T -c

substituting the correct file name in each window. Then I pressed <Enter> in one

window, quickly switched to the next, pressed <Enter> again, and repeated the process in the

third window. Here is a shot of my desktop after I ran these:

Of course, this took a few tries to get working correctly. The timing of the mouse clicks and <

Enter> presses took some practice and I dropped and recreated my table more than a few times.

The Possible Results

Before I reveal what happened, there were a couple of possible outcomes that I had anticipated. Either

the server would treat each BCP load as a transaction, in which case I would expect to find rows 1-

30,000 with a single source, rows 30,001-60,000 with a second source and 60,001-90,000 with the third

course. If I were implementing my own identity, this is likely what would happen (if each load were

its own transaction). The other alternative would be that despite each load being a transaction, the

server had some internal optimizations and I would receive a spread of sources among all identity values.

What I found in my results, after successfully having simultaneous loads (more or less), was that SQL

Server was able to generate these identity values extremely quickly. The load of these 90,000 rows

completed in less than 5 seconds on numerous occassions. Looking at a sample result set of data, you can

see my results:


Conclusions

I don't necesarily recommend identity values for all tables. There are many places that I have found

where a sequential, numeric key can be helpful, however, and for these situations, I do highly recommend

the identity property over any other type of sequence generator.

As always I welcome feedback on this article using the "Your Opinion" button below. Please also

rate this article.

Steve Jones

©dkRanch.net October 2001


Return to Steve Jones Home

 

Rate

Share

Share

Rate