Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Overusing Identities


Overusing Identities

Author
Message
Steve Jones
Steve Jones
SSC-Dedicated
SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)SSC-Dedicated (35K reputation)

Group: Administrators
Points: 35992 Visits: 18728
Comments posted to this topic are about the item Overusing Identities

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44952 Visits: 39862
I use identities for the same reason that people are issued SSN's... It makes life real simple as a unique identifier in the world of large batch programming. As I prepare to install handrails for the upcoming ride :-P, I'll also tell you that if a natural key exists, I'll still usually use an IDENTITY column as at least a Non-null Unique Key and make the natural key the PK unless, maybe, if it's a totally static table.

I'll usually put the clustered index on an IDENTITY column especially for large tables that are also highly transactional just to keep page splits at the lowest level possible. Sure, a date column would probably suffice but there isn't always a date column and, when there is, data is not always inserted in the same order as those dates. To me, it's worth the 4 bytes per row to keep from having nearly 8k bytes suddenly open up because a single row insert caused a page split. The extra 4 byte column is usually worth it to me in the savings it causes during maintenance.

I also tend to use IDENTITY columns on everything I import. I always import to staging tables and never to the final table so that if someone does send "dupes", I can easily find them and tell the vendor of their problem. Yes, there are several ways to avoid the dupes but I always want to capture them for one reason or another and the IDENTITY column provides uniqueness when there may otherwise be none. The use of IDENTITY columns also makes it a snap to archive data because I can usually archive rows on one logical end of the table while the users continue to use the more recent other "end" of the table without mutual interference.

The bottom line is that as with everything else, "It Depends". I can see arguments for using IDENTITIES for all tables, no tables, or just some tables. I happen to be one of those that favors the use of IDENTITY columns.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
george sibbald
george sibbald
SSCertifiable
SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)

Group: General Forum Members
Points: 6324 Visits: 13686
Just to note that using a GUID as a primary key might not be a good idea if the index is created as a clustered index due to the random values given to GUIDs, and their length at 16 bytes.

NEWSEQUENTIALID() can be used in conjunction with GUIDs if they are to be the clustered index.

---------------------------------------------------------------------
SQLRNNR
SQLRNNR
SSC-Insane
SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)

Group: General Forum Members
Points: 21055 Visits: 18258
Jeff Moden (2/27/2010)
I use identities for the same reason that people are issued SSN's... It makes life real simple as a unique identifier in the world of large batch programming. ...
I'll usually put the clustered index on an IDENTITY column especially for large tables that are also highly transactional just to keep page splits at the lowest level possible. ...
I also tend to use IDENTITY columns on everything I import. ...

The bottom line is that as with everything else, "It Depends". I can see arguments for using IDENTITIES for all tables, no tables, or just some tables. I happen to be one of those that favors the use of IDENTITY columns.



I favor the use of identity columns for much the same reasons. I don't like clustered indexes on GUID columns, and prefer to use a better natural key or an identity column. Jeff pretty much summed it up for me though.



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw

Steve Cullen
Steve Cullen
SSChasing Mays
SSChasing Mays (603 reputation)SSChasing Mays (603 reputation)SSChasing Mays (603 reputation)SSChasing Mays (603 reputation)SSChasing Mays (603 reputation)SSChasing Mays (603 reputation)SSChasing Mays (603 reputation)SSChasing Mays (603 reputation)

Group: General Forum Members
Points: 603 Visits: 1226
We create identity columns for just about every table. It tends to make things much easier. One such area this makes things easier in is transactional replication.

One area we don't do this with is in staging tables for ETL import. I tend to go back and forth on the whole identity idea there because there are occasionally dups to deal with, as Jeff has pointed out. It's never been a major issue though.

Everywhere else we use identity, whether we have a natural key or not.

Converting oxygen into carbon dioxide, since 1955.


GilaMonster
GilaMonster
SSC-Forever
SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)

Group: General Forum Members
Points: 47165 Visits: 44346
I don't have a problem with using identities. What I do have a problem with is using identities as the only unique identifier of the entire table.

When the primary key is an identity (or a GUID) and there is no other unique column or set of columns in the table, then it's easy to get into the situation where the only thing that differs between two rows is a meaningless artificial key. That easily leads to unamusing data integrity issues at a later date.

It's important to identify the candidate keys during modelling. Maybe one gets selected as the primary, maybe an artificial key gets used instead, doesn't really matter, but then the candidate keys should get unique constraints defined on them so that the meaningless artificial key is not the sole measure of uniqueness in the table.


Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Michael Lysons
Michael Lysons
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1256 Visits: 1392
I've used Identity columns a lot in my data warehouse work, and I love the simplicity they bring. I've also used them extensively when working on transactional systems, and again found that the simplicity made for easier modelling.

However, more recently I created a new system and decided to go for entirely natural keys. For no other reason than I fancied a change to what it was like, and so I ended up with natural, composite keys that brought a certain elegance to the design.

I'd say it was more work with the natural keys though. And if I was building the same system again, I'm not sure which way I'd go.
bill.sugden
bill.sugden
SSC-Enthusiastic
SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)

Group: General Forum Members
Points: 113 Visits: 144
When I was a newbie to DB development (from a programmeer background) I liberally spread IDENTITY columns as PK. Simpler for me I thought and easier for the other developers.

Now I'm older and wiser. As Gail says, this can bite you if you are not careful. Now I try very hard to identify natural keys, and if they are there, even if composite keys are required, they are at least given a unique index. but the convenience of using IDENTITY for internal work and easier joining is very great, and I would still recommend their usage.

By the way, I hated Aaron's example of an order detail table. Of course it won't work with a simple Order_ID + Product_ID as PK. I would hope that any order detail table would have a unique ID of Order_ID + LineNo. And this is one case where an identity column IS redundant.

Bill
sqlvogel
sqlvogel
SSC-Addicted
SSC-Addicted (484 reputation)SSC-Addicted (484 reputation)SSC-Addicted (484 reputation)SSC-Addicted (484 reputation)SSC-Addicted (484 reputation)SSC-Addicted (484 reputation)SSC-Addicted (484 reputation)SSC-Addicted (484 reputation)

Group: General Forum Members
Points: 484 Visits: 3706
I'm with Gail on this. What you choose as a primary key is not important. Candidate keys are the important issue. Just be sure you are enforcing the candidate keys required for the integrity of the data. Don't fall into the trap of thinking that "primary key" means a table has only one key or that one key is more important than others.
Filipe
Filipe
SSC-Enthusiastic
SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)

Group: General Forum Members
Points: 198 Visits: 345
Actually I must disagree and say that Identity columns are nice and convinient but can be very bad as well. Why would someone define an identity column on a table that already has another unique key, specially if that column is an int or bigint?
Recently I had living proof that is not a good practice.
I have one database that was developed with the maxim that identities are never enough, and this one table had a natural key (unique, int) that was theonly field used in the application, and had the identity, that was the clustered PK.
I started noticing dead-locks on that table, and the dead-lock was between updates on the PK and the unique key. Since all access to the table was via the UK, when deleting from that table the order of the locks was the reverse of the selects, and it resulted in dead-locks. Getting rid of the PK (and the identity all together) got rid of the problem as well.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search