Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««1234»»»

Surogate Keys are not always the answer but are freqently used Expand / Collapse
Author
Message
Posted Wednesday, October 13, 2010 4:37 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 11:44 AM
Points: 6,236, Visits: 7,380
David Portas (10/13/2010)

No it's not the same for the reasons I already stated: Constraints are standard SQL, unique indexes are not; Constraints are understood and used by more people; Some software tools will recognise keys declared as constraints but will not recognise those created using the CREATE INDEX syntax. In my view those are all very good reasons to use uniquness constraints and avoid creating unique indexes directly.


Here I feel we must disagree. Since finding a free copy of the 97 standard (or 99) has been dubious at best, I can't double check your comment, so I'll just assume it's accurate. Seems to make sense. However, I don't work in the SQL standard. I work in MS SQL Server and T-SQL. Because of that, and because an index can give me performance where a constraint will not, but still takes up the same system space and engine mechanics as a declared index under the hood, I would prefer the index.

You make a good point though about 3rd party tools. I rarely use them so it rarely comes up as a difficulty for me, and that may be part of the difference.

That's more or less what I thought you meant, but it's not technically correct. Two things make a key (aka "candidate key") and they are 1) uniqueness and 2) irreducibility.


...and is also known as the minimal superkey. I'm familiar with the theory. Practice, however, dictates other usage...

Those are the two fundamental qualifications for a key. Stability is usually desirable as well but the fact that a key changes does not make it any less of a key than some other key that doesn't change. Of course that's just terminology I suppose, the effect is exactly the same whether you choose to call it a key or not - the same considerations apply. But I think it's worth saying because a lot of people do seem to get very confused about what keys are and why they are important.

I'm avoiding the term "alternate key" because primary and alternate keys are really the same thing. All are candidate keys.


You are absolutely correct according to the dictionary, and theoretical design. I grant that the terminology is accurate, and I went and double checked just to make sure I remembered it correctly. :)

This, perhaps, is where we are disconnecting in our discussion, because theoretical design is like the ISO standard. Nice to have, but not always practical. Identification of candidate keys are useless in practical design other then as the first step to identifying useful primary and alternate keys, or perhaps to assist in search mechanic selectivity to acquire more speed and better index usage.

To me, a useful key is something that if I found it yesterday with that value, I'll find it tomorrow with the same value, and I can use it to connect to data throughout the design's structure in place of the existing 'Primary Key' if I so chose. While this may not fall properly under the definition of generic db design standards (that I can't seem to get my hands on), it does fall under the definition of what myself and a number of DBAs I've worked with understand them to be in practical usage.

[EDIT:] Aaaaaand I finally look at the top of the screen and realize I'm in the "Relational Theory" Forum... I'll shaddup now. [/EDIT]



- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. |Tally Tables

Twitter: @AnyWayDBA
Post #1004045
Posted Wednesday, October 13, 2010 10:25 PM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Tuesday, August 19, 2014 3:55 PM
Points: 4,186, Visits: 4,264
It is usually preferable to use a natural key as opposed to a surrogate key in a relational Model. It depends on the situation

In a Dimensional Model you do not use the natural key a Surrogate key is used.

I can provide additional clarification if necessary.


For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Post #1004107
Posted Thursday, October 14, 2010 3:14 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Yesterday @ 8:14 AM
Points: 446, Visits: 3,329
Craig Farrell (10/13/2010)

I work in MS SQL Server and T-SQL. Because of that, and because an index can give me performance where a constraint will not, but still takes up the same system space and engine mechanics as a declared index under the hood, I would prefer the index.


I don't understand your claim that an index alone provides better performance than a uniqueness constraint. Can you back that up with an example? I'm pretty sure you'll find they are always exactly the same, in which case the constraint still has all the advantages I mentioned before.


To me, a useful key is something that if I found it yesterday with that value, I'll find it tomorrow with the same value, and I can use it to connect to data throughout the design's structure in place of the existing 'Primary Key' if I so chose. While this may not fall properly under the definition of generic db design standards (that I can't seem to get my hands on), it does fall under the definition of what myself and a number of DBAs I've worked with understand them to be in practical usage.


However, database users would disagree with you. End users don't use surrogate keys to identify data, they use natural keys (actually I much prefer the term "business keys" but it means exactly the same thing). Therefore identifying and enforcing the business keys is normally essential for the users to make sense of the data and to get correct results from it. That includes keys that change. If keys are being used by the business process as identifiers then they are least as important as other keys that don't change - perhaps more so. Keys therefore need to be enforced and properly identified in the database if you want to ensure correct results from it.

Furthermore, the distinction you are trying to make between changing and unchanging keys is ultimately impossible to determine. If all the key values on a row change then there is no sound basis for saying that any of them "changed" - it is simply a different row because rows are identified by their keys. It is usually impractical to prevent data from being changed at some point during its lifetime and often you may not even know if it has been. Therefore the defining property of a key that really matters is uniqueness, not immutability.


David
Post #1004195
Posted Thursday, October 14, 2010 3:29 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Yesterday @ 3:23 AM
Points: 112, Visits: 297
Craig Farrell (10/13/2010)

[EDIT:] Aaaaaand I finally look at the top of the screen and realize I'm in the "Relational Theory" Forum... I'll shaddup now. [/EDIT]
Yeah - I learned a few years ago not to get in a disagreement with David about relational theory unless I was absolutely certain of my ground. And even then it turned out I was wrong

Just to throw in another tack on the surrogate Vs natural key debate, there is a certain class of relationship that can't be modelled declaratively using surrogates, only with natural keys. I've sometimes wondered how the "surrogate and surrogates only" brigade handle them.
Post #1004204
Posted Thursday, October 14, 2010 6:32 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Tuesday, January 28, 2014 8:15 AM
Points: 3,068, Visits: 4,639
Still amazes me - yet it doesn't surprise me anymore - how this "surrogate Vs natural key debate" goes on and on.

Please let me start by stating my position on the issue which is: "Use Natural keys whenever possible, use Surrogate keys whenever necessary(*)"

(*) Like in FACT-DIM relationships in a dimensional model.

Having said that I think I've figured out why this "debate" gets particularly intense in the SQL Server universe - it has to be with clustered indexes on identity columns, please allow me to explain further.

Let me quote Microsoft recommendations on "Clustered Index Design Guidelines" where it reads...

With few exceptions, every table should have a clustered index defined on the column, or columns, that offer the following:
- Can be used for frequently used queries.
- Provide a high degree of uniqueness.
- Can be used in range queries.

. . .

Generally, you should define the clustered index key with as few columns as possible. Consider columns that have one or more of the following attributes:
- Are unique or contain many distinct values
- Are accessed sequentially
- Used frequently to sort the data retrieved from a table
- Defined as IDENTITY because the column is guaranteed to be unique within the table


I personally find these guidelines sound but I also think some DBA construct them as "you must have a clustered index on an identity column" missing a couple of key words in Microsoft's documentation like "with few exceptions" and "columns that have one or more of the following attributes".

My interpretation of the guidelines goes like: a clustered index is helpfull if chosen column is used in certain ways and is already defined in certain ways - which may (or may not) include the fact of already being of the identity datatype.

Hope this brings fresh air to the so called debate - I know it's almost impossible to end it


_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
Post #1004295
Posted Thursday, October 14, 2010 6:44 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Yesterday @ 3:23 AM
Points: 112, Visits: 297
- Defined as IDENTITY because the column is guaranteed to be unique within the table
That's not even correct.
Post #1004311
Posted Thursday, October 14, 2010 6:52 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Tuesday, January 28, 2014 8:15 AM
Points: 3,068, Visits: 4,639
hallidayd (10/14/2010)
- Defined as IDENTITY because the column is guaranteed to be unique within the table
That's not even correct.


please do elaborate.


_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
Post #1004320
Posted Thursday, October 14, 2010 6:57 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Yesterday @ 3:23 AM
Points: 112, Visits: 297
Sorry - don't want to take this off tack, but a column defined as IDENTITY is not "guaranteed to be unique within the table". Only a unique index\ constraint (or some hacky trigger\ UDF) can guarantee that.
Post #1004330
Posted Thursday, October 14, 2010 7:02 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Yesterday @ 3:23 AM
Points: 112, Visits: 297
Actually, I'm not taking it entirely off tack. My correction is born of a similar observation\ pet peeve as yours. Far too often I see primary key\ IDENTITY\ Autonumber (in the case of Access) used as though interchangeable. They are, of course, not.
Post #1004334
Posted Thursday, October 14, 2010 7:05 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Yesterday @ 8:14 AM
Points: 446, Visits: 3,329
hallidayd (10/14/2010)
- Defined as IDENTITY because the column is guaranteed to be unique within the table
That's not even correct.


Ouch, yes that is a bit of a glaring error. I've seen production databases with duplicate values in IDENTITY columns, which had nasty consequences. Feedback sent.


David
Post #1004336
« Prev Topic | Next Topic »

Add to briefcase ««1234»»»

Permissions Expand / Collapse