Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12345»»»

Overusing Identities Expand / Collapse
Author
Message
Posted Saturday, February 27, 2010 11:39 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: Administrators
Last Login: Today @ 2:00 PM
Points: 33,062, Visits: 15,175
Comments posted to this topic are about the item Overusing Identities






Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #874032
Posted Saturday, February 27, 2010 12:59 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 1:55 PM
Points: 36,749, Visits: 31,197
I use identities for the same reason that people are issued SSN's... It makes life real simple as a unique identifier in the world of large batch programming. As I prepare to install handrails for the upcoming ride , I'll also tell you that if a natural key exists, I'll still usually use an IDENTITY column as at least a Non-null Unique Key and make the natural key the PK unless, maybe, if it's a totally static table.

I'll usually put the clustered index on an IDENTITY column especially for large tables that are also highly transactional just to keep page splits at the lowest level possible. Sure, a date column would probably suffice but there isn't always a date column and, when there is, data is not always inserted in the same order as those dates. To me, it's worth the 4 bytes per row to keep from having nearly 8k bytes suddenly open up because a single row insert caused a page split. The extra 4 byte column is usually worth it to me in the savings it causes during maintenance.

I also tend to use IDENTITY columns on everything I import. I always import to staging tables and never to the final table so that if someone does send "dupes", I can easily find them and tell the vendor of their problem. Yes, there are several ways to avoid the dupes but I always want to capture them for one reason or another and the IDENTITY column provides uniqueness when there may otherwise be none. The use of IDENTITY columns also makes it a snap to archive data because I can usually archive rows on one logical end of the table while the users continue to use the more recent other "end" of the table without mutual interference.

The bottom line is that as with everything else, "It Depends". I can see arguments for using IDENTITIES for all tables, no tables, or just some tables. I happen to be one of those that favors the use of IDENTITY columns.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #874053
Posted Saturday, February 27, 2010 1:18 PM
SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 11:08 AM
Points: 5,975, Visits: 12,878
Just to note that using a GUID as a primary key might not be a good idea if the index is created as a clustered index due to the random values given to GUIDs, and their length at 16 bytes.

NEWSEQUENTIALID() can be used in conjunction with GUIDs if they are to be the clustered index.


---------------------------------------------------------------------

Post #874056
Posted Saturday, February 27, 2010 3:39 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 2:39 PM
Points: 21,210, Visits: 14,902
Jeff Moden (2/27/2010)
I use identities for the same reason that people are issued SSN's... It makes life real simple as a unique identifier in the world of large batch programming. ...
I'll usually put the clustered index on an IDENTITY column especially for large tables that are also highly transactional just to keep page splits at the lowest level possible. ...
I also tend to use IDENTITY columns on everything I import. ...

The bottom line is that as with everything else, "It Depends". I can see arguments for using IDENTITIES for all tables, no tables, or just some tables. I happen to be one of those that favors the use of IDENTITY columns.



I favor the use of identity columns for much the same reasons. I don't like clustered indexes on GUID columns, and prefer to use a better natural key or an identity column. Jeff pretty much summed it up for me though.




Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server


SQL RNNR

Posting Performance Based Questions - Gail Shaw
Posting Data Etiquette - Jeff Moden
Hidden RBAR - Jeff Moden
VLFs and the Tran Log - Kimberly Tripp
Post #874073
Posted Sunday, February 28, 2010 8:17 PM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Wednesday, September 19, 2012 8:39 AM
Points: 595, Visits: 1,226
We create identity columns for just about every table. It tends to make things much easier. One such area this makes things easier in is transactional replication.

One area we don't do this with is in staging tables for ETL import. I tend to go back and forth on the whole identity idea there because there are occasionally dups to deal with, as Jeff has pointed out. It's never been a major issue though.

Everywhere else we use identity, whether we have a natural key or not.


Converting oxygen into carbon dioxide, since 1955.

Post #874283
Posted Monday, March 1, 2010 12:42 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 2:18 PM
Points: 42,449, Visits: 35,504
I don't have a problem with using identities. What I do have a problem with is using identities as the only unique identifier of the entire table.

When the primary key is an identity (or a GUID) and there is no other unique column or set of columns in the table, then it's easy to get into the situation where the only thing that differs between two rows is a meaningless artificial key. That easily leads to unamusing data integrity issues at a later date.

It's important to identify the candidate keys during modelling. Maybe one gets selected as the primary, maybe an artificial key gets used instead, doesn't really matter, but then the candidate keys should get unique constraints defined on them so that the meaningless artificial key is not the sole measure of uniqueness in the table.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #874325
Posted Monday, March 1, 2010 2:41 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 3:41 AM
Points: 1,148, Visits: 1,070
I've used Identity columns a lot in my data warehouse work, and I love the simplicity they bring. I've also used them extensively when working on transactional systems, and again found that the simplicity made for easier modelling.

However, more recently I created a new system and decided to go for entirely natural keys. For no other reason than I fancied a change to what it was like, and so I ended up with natural, composite keys that brought a certain elegance to the design.

I'd say it was more work with the natural keys though. And if I was building the same system again, I'm not sure which way I'd go.
Post #874344
Posted Monday, March 1, 2010 3:14 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, July 13, 2011 4:11 AM
Points: 113, Visits: 144
When I was a newbie to DB development (from a programmeer background) I liberally spread IDENTITY columns as PK. Simpler for me I thought and easier for the other developers.

Now I'm older and wiser. As Gail says, this can bite you if you are not careful. Now I try very hard to identify natural keys, and if they are there, even if composite keys are required, they are at least given a unique index. but the convenience of using IDENTITY for internal work and easier joining is very great, and I would still recommend their usage.

By the way, I hated Aaron's example of an order detail table. Of course it won't work with a simple Order_ID + Product_ID as PK. I would hope that any order detail table would have a unique ID of Order_ID + LineNo. And this is one case where an identity column IS redundant.

Bill
Post #874351
Posted Monday, March 1, 2010 4:26 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 12:32 PM
Points: 440, Visits: 3,268
I'm with Gail on this. What you choose as a primary key is not important. Candidate keys are the important issue. Just be sure you are enforcing the candidate keys required for the integrity of the data. Don't fall into the trap of thinking that "primary key" means a table has only one key or that one key is more important than others.

David
Post #874367
Posted Monday, March 1, 2010 5:52 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, June 30, 2014 7:33 AM
Points: 176, Visits: 292
Actually I must disagree and say that Identity columns are nice and convinient but can be very bad as well. Why would someone define an identity column on a table that already has another unique key, specially if that column is an int or bigint?
Recently I had living proof that is not a good practice.
I have one database that was developed with the maxim that identities are never enough, and this one table had a natural key (unique, int) that was theonly field used in the application, and had the identity, that was the clustered PK.
I started noticing dead-locks on that table, and the dead-lock was between updates on the PK and the unique key. Since all access to the table was via the UK, when deleting from that table the order of the locks was the reverse of the selects, and it resulted in dead-locks. Getting rid of the PK (and the identity all together) got rid of the problem as well.
Post #874373
« Prev Topic | Next Topic »

Add to briefcase 12345»»»

Permissions Expand / Collapse