I'll bet anyone who has worked with Access or SQL for more than a few hours understands what an autonumber/identity column does - and how useful it is for creating "meaningless" primary keys. Setting up an identity column has always been easy. It's an int column, good for indexing. There are a couple items that trip up the unwary user though:
- Although the values are sequential, there is no guarantee that there will not be gaps in the sequence. If you HAVE to have sequential ID's, you have to code your own solution.
- Identity cols don't necessarily exist on other database platforms. If you're trying to be platform agnostic, you're on your own.
- In SQL 6.5 and 7.0, you retrieve the value by selecting @@identity after doing an insert. Which works fine unless there is a trigger on the table that does an insert into a different table that
also has an identity column - you get back the ID from the second table, not the first. Even if you don't have this problem when you build, someone may add a trigger later on to do auditing - then your code breaks. With SQL 2000 you can use the Scope_Identity() function to avoid this problem.
- If you're using merge replication, you have to deal with managing "ranges" for each subscriber.
SQL helps, but it's not a lot of fun.
- A pretty common task is to an order/order details insert. If you're using an identity column for the order id, you have two choices. One is to insert the order, return the id, then process the
details (which need the parent order id). This works fine, but results in an extra round trip. The other is to add enough parameters to your order insert stored proc to accommodate
the details as well. This saves the round trip (unless you still need the ID or ID's returned to the client)
at the expense of some hideous code in your stored procedure.
Using a column of uniqueidentifier can be a useful way to avoid some of these issues. For those
of you not familiar with this data type, its a 16 byte value that is guaranteed to be unique in
the world. In SQL you generate one using the NewID() function. Although I wasn't able to find out
the exact details of how it's constructed, MSDN says it is based on a combination of the MAC
address from the network card, the current date time, and possibly some other values. They
typically look like this:
Some obvious downsides are that it's a bigger data type, 16 bytes versus 4 for the int, which will affect performance to some degree. They are harder to work with mentally I think (picture yourself telling a co-worker to check order id '6F9619FF-8B86-D011-B42D-00C04FC964FF').
If you want a uniqueidentifier column to be auto populated like an identity column, you have to set the default for the column to be NewID(). Setting the IsRowGuid property automatically adds the NewID() default using EM in SQL 2000 (I didn't have SQL 7 available for testing) and lets SQL that this value uniquely identifies the row. This is not quite the same as making the column the primary key. If you don't mark your uniqueidentifier column as the rowguid when you configure merge replication it will add ANOTHER uniqueidentifier column to your table.
With that intro done, let's evaluate how using the uniqueidentifier addresses the problems I listed with identity columns:
- Sequential values. Doesn't help. Same problem with identity, you'll have to write your own solution.
- Platform independent. Helps as long as you're running a Microsoft OS since we rely on it for the ID generation. ID can be stored as a string for databases that do not support uniqueidentifier directly. For other OS's you could code your own replacement for NewID().
- Trigger. Helps. @@Identity doesn't work with uniqueidentifiers. If you want the value for the row just inserted, you'll have to select it back.
- Merge replication. Helps. You actually save 4 bytes since it will add a unique identifier column if you don't have one. You might still be better off keeping the identity col if it is your primary key since it will be more effective for indexing (especially if it is your clustered index).
- Order details. Helps. You can generate the ID on the client instead of the server. No round trips.
The last one is the biggie. You don't have to use NewID() to generate these values, you can generate them on any Windows client with a small bit of code. The following code is a portion of an article from MSDN. Add all of this to a module in VB:
|Private Type GUID
Data1 As Long
Data2 As Integer
Data3 As Integer
Data4(7) As Byte
Private Declare Function CoCreateGuid Lib "OLE32.DLL" (pGuid As GUID) As Long
Public Function GetGUID() As String
'(c) 2000 Gus Molina
Dim udtGUID As GUID
If (CoCreateGuid(udtGUID) = 0) Then
GetGUID = _
String(8 - Len(Hex$(udtGUID.Data1)), "0") & Hex$(udtGUID.Data1) & _
String(4 - Len(Hex$(udtGUID.Data2)), "0") & Hex$(udtGUID.Data2) & _
String(4 - Len(Hex$(udtGUID.Data3)), "0") & Hex$(udtGUID.Data3) & _
IIf((udtGUID.Data4(0) < &H10), "0", "") & Hex$(udtGUID.Data4(0)) & _
IIf((udtGUID.Data4(1) < &H10), "0", "") & Hex$(udtGUID.Data4(1)) & _
IIf((udtGUID.Data4(2) < &H10), "0", "") & Hex$(udtGUID.Data4(2)) & _
IIf((udtGUID.Data4(3) < &H10), "0", "") & Hex$(udtGUID.Data4(3)) & _
IIf((udtGUID.Data4(4) < &H10), "0", "") & Hex$(udtGUID.Data4(4)) & _
IIf((udtGUID.Data4(5) < &H10), "0", "") & Hex$(udtGUID.Data4(5)) & _
IIf((udtGUID.Data4(6) < &H10), "0", "") & Hex$(udtGUID.Data4(6)) & _
IIf((udtGUID.Data4(7) < &H10), "0", "") & Hex$(udtGUID.Data4(7))
If your developers add this code to their apps (or even better, to a small DLL shared across
projects) they can generate their own keys, like this:
dim sKey as string
I also found some sample code on MSDN that you may find useful. Give this technique a try, I think you'll find it comes in handy!