Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Using the ROWGUIDCOL attribute on primary key Expand / Collapse
Author
Message
Posted Wednesday, September 14, 2005 4:42 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, September 16, 2013 12:25 AM
Points: 182, Visits: 7
Hi!

I'm using unqiueidentifiers as my primary keys for all tables with a default value of NEWID(). Are there any advantages/disadvantages of adding the ROWGUIDCOL attribute on the primary key column?

TIA

Jonas




Brgds

Jonas
Post #219520
Posted Wednesday, September 14, 2005 11:54 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Wednesday, July 23, 2014 8:16 AM
Points: 1,035, Visits: 410

My first response is why in the world are you using guids on "all tables" for a primary key?

Assuming you have your reasons... The ROWGUIDCOL property doesn't really add much, other than you can add to the overall confusion by referencing the column by using "ROWGUIDCOL" instead of the actual column name in your queries, but it doesn't hurt either. 




/*****************

If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek



*****************/
Post #219760
Posted Thursday, September 15, 2005 10:52 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Sunday, July 20, 2014 11:55 AM
Points: 1,945, Visits: 2,860

BY DEFINITION, this is not a key at all. 

 

I have a taxonomy of key types.  We have a lot of problems with terminology on this one, so let me get that out of the way. 

There is no such thing as a "universal, one-size-fits-all" key.  Just as no two sets of entities are the same, the attributes that make them unique have to be found in the reality of the data.  Here is my classification of types of keys: 

                             natural artificial exposed surrogate
==================================================================
Constructed from reality    |  
of the data model           |  Y         N         N         Y
                            |
verifiable in reality       |  Y         N         N         N
                            |
verifiable in itself        |  Y         Y         N         N
                            |
visible to the user         |  Y         Y         Y         N

1) A natural key is a subset of attributes which occur in a table and act as a unique identifier.  They are seen by the user.  You can go to the external reality and verify them.  you would also like to have some validation rule.  Example: UPC codes on consumer goods (read the package barcode) and validate them with a check digit or a manufacturer's website, geographical co-ordinates (get a GPS). 

2) An artificial key is an extra attribute added to the table which is seen by the user.  It does not exist in the external reality, but can be verified for syntax or check digits inside itself.  It is up to the DBA to maintain a trusted source for them.  Example: the open codes in the UPC scheme which a user can assign to his own stuff.  The check digits still work, but you have to verify them inside your own enterprise.

If you have to construct a key yourself, it takes time to deisgn them, to invent a validation rule, set up audit trails, etc.

3) An "exposed physical locator" is not based on attributes in the data model and is exposed to user.  There is no reasonable way to predict it or verify it, since it usually comes fromt he physical state of the hardware at the time of data insertion.  The system obtains a value thru some physical process in the storage hardware totally unrelated to the logical data model. Example: IDENTITY columns, other proprietary, non-relaitonal auto-numbering devices. 

Technically, these are not really keys at all, sinc they are attributes of the PHYSICAL storage and are not even part of the LOGICAL data model.  But they are handy for lazy, non-RDBMS programmers who don't want to research or think!  This is the worst way to program in SQL. 

4) A surrogate key is system generated to replace the actual key behind the covers where the user never sees it.  It is based on attributes in the table.  Example: Teradata hashing algorithms, indexes, pointer chains, ADABASE numbers, etc. 

The fact that you can never see it or use it for DELETE and UPDATE or create it for INSERT is vital. When users can get to them, they will screw up the data integrity by getting the real keys and these physical locators out of synch.  The system must maintain them. 

**  Notice that people get "exposed physical locator" and surrogate mixed up; they are totally different concepts. **

An appeal to authority, with a quote from Dr. Codd: "..Database users may cause the system to generate or delete a surrogate, but they have no control over its value, nor is its value ever displayed to them ..."(Dr. Codd in ACM TODS, pp 409-410) and Codd, E. (1979), Extending the database relational model to capture more meaning.  ACM Transactions on Database Systems, 4(4).  pp. 397-434. 

This means that a surrogate ought to act like an index; created by the user, managed by the system and NEVER seen by a user.  That means never used in queries, DRI or anything else that a user does.

Codd also wrote the following:

"There are three difficulties in employing user-controlled keys as permanent surrogates for entities.

(1)  The actual values of user-controlled keys are determined by users and must therefore be subject to change by them (e.g. if two companies merge, the two employee databases might be combined with the result that some or all of the serial numbers might be changed.).

(2)  Two relations may have user-controlled keys defined on distinct domains (e.g. one uses social security, while the other uses employee serial numbers) and yet the entities denoted are the same.

(3)  It may be necessary to carry information about an entity either before it has been assigned a user-controlled key value or after it has ceased to have one (e.g. and applicant for a job and a retiree).

These difficulties have the important consequence that an equi-join on common key values may not yield the same result as a join on common entities.  A solution - proposed in part [4] and more fully in [14] - is to introduce entity domains which contain system-assigned surrogates.  Database users may cause the system to generate or delete a surrogate, but they have no control over its value, nor is its value ever displayed to them....." (Codd in ACM TODS, pp 409-410).

References

Codd, E. (1979), Extending the database relational model to capture more meaning.  ACM Transactions on Database Systems, 4(4).  pp. 397-434

The steps for finding a key are

1) Look for an industry standard and the trusted external source that maintains and verifies it. I count this as a natural key, but you could argue that it is artificial. 

2) Look for a natural key in the attributes.  Example: (longitude, latitude) makes a good key for a geographical location. A GPS can be used to verify it. 

3) If you must design a new identifier, plan it carefully -- especially if people will see and use it.  You have to be able to verify it in application programs, so you should have a regular expression, other syntax rules, check digits.  You have to be able to be verify in the reality of the model or with a trusted source that you maintain. 

Validation means the format is good -- "This could one of our invoice numbers because it is 7 digits long, passes a Bull code check digit and begins with  { '01', '02', '07', '99'}"

Verification means that it references a real entity -- "This is a real invoice because I can look it up in Accounts Payable and trace its approval back to Cindy Lu Who on 2005-02-12." 
 

Also look at Codd's rules 8-11 to see why these things cannot be relational.  I just did an exmapel of #11 (11. Distribution independence) in the VIN and  DVM example. 

 



Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #220250
Posted Thursday, September 15, 2005 3:26 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: 2 days ago @ 1:34 AM
Points: 56, Visits: 173

According to Books Online "The ROWGUIDCOL property is primarily used by SQL Server replication."

The same page (Using uniqueidentifier Data) goes on to give some reasons not to use UniqueIdentifiers.

 




Best Regards,

Otto Schreibke
solwiz.ch
Post #220405
Posted Thursday, January 22, 2009 7:45 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, July 23, 2009 6:23 AM
Points: 1, Visits: 7
Joe Celko (9/15/2005)


BY DEFINITION, this is not a key at all.

I have a taxonomy of key types. We have a lot of problems with terminology on this one, so let me get that out of the way.

There is no such thing as a "universal, one-size-fits-all" key. Just as no two sets of entities are the same, the attributes that make them unique have to be found in the reality of the data. Here is my classification of types of keys:

natural artificial exposed surrogate
==================================================================
Constructed from reality |
of the data model | Y N N Y
|
verifiable in reality | Y N N N
|
verifiable in itself | Y Y N N
|
visible to the user | Y Y Y N

1) A natural key is a subset of attributes which occur in a table and act as a unique identifier. They are seen by the user. You can go to the external reality and verify them. you would also like to have some validation rule. Example: UPC codes on consumer goods (read the package barcode) and validate them with a check digit or a manufacturer's website, geographical co-ordinates (get a GPS).

2) An artificial key is an extra attribute added to the table which is seen by the user. It does not exist in the external reality, but can be verified for syntax or check digits inside itself. It is up to the DBA to maintain a trusted source for them. Example: the open codes in the UPC scheme which a user can assign to his own stuff. The check digits still work, but you have to verify them inside your own enterprise.

If you have to construct a key yourself, it takes time to deisgn them, to invent a validation rule, set up audit trails, etc.

3) An "exposed physical locator" is not based on attributes in the data model and is exposed to user. There is no reasonable way to predict it or verify it, since it usually comes fromt he physical state of the hardware at the time of data insertion. The system obtains a value thru some physical process in the storage hardware totally unrelated to the logical data model. Example: IDENTITY columns, other proprietary, non-relaitonal auto-numbering devices.

Technically, these are not really keys at all, sinc they are attributes of the PHYSICAL storage and are not even part of the LOGICAL data model. But they are handy for lazy, non-RDBMS programmers who don't want to research or think! This is the worst way to program in SQL.

4) A surrogate key is system generated to replace the actual key behind the covers where the user never sees it. It is based on attributes in the table. Example: Teradata hashing algorithms, indexes, pointer chains, ADABASE numbers, etc.

The fact that you can never see it or use it for DELETE and UPDATE or create it for INSERT is vital. When users can get to them, they will screw up the data integrity by getting the real keys and these physical locators out of synch. The system must maintain them.

** Notice that people get "exposed physical locator" and surrogate mixed up; they are totally different concepts. **

An appeal to authority, with a quote from Dr. Codd: "..Database users may cause the system to generate or delete a surrogate, but they have no control over its value, nor is its value ever displayed to them ..."(Dr. Codd in ACM TODS, pp 409-410) and Codd, E. (1979), Extending the database relational model to capture more meaning. ACM Transactions on Database Systems, 4(4). pp. 397-434.

This means that a surrogate ought to act like an index; created by the user, managed by the system and NEVER seen by a user. That means never used in queries, DRI or anything else that a user does.

Codd also wrote the following:

"There are three difficulties in employing user-controlled keys as permanent surrogates for entities.

(1) The actual values of user-controlled keys are determined by users and must therefore be subject to change by them (e.g. if two companies merge, the two employee databases might be combined with the result that some or all of the serial numbers might be changed.).

(2) Two relations may have user-controlled keys defined on distinct domains (e.g. one uses social security, while the other uses employee serial numbers) and yet the entities denoted are the same.

(3) It may be necessary to carry information about an entity either before it has been assigned a user-controlled key value or after it has ceased to have one (e.g. and applicant for a job and a retiree).

These difficulties have the important consequence that an equi-join on common key values may not yield the same result as a join on common entities. A solution - proposed in part [4] and more fully in [14] - is to introduce entity domains which contain system-assigned surrogates. Database users may cause the system to generate or delete a surrogate, but they have no control over its value, nor is its value ever displayed to them....." (Codd in ACM TODS, pp 409-410).

References

Codd, E. (1979), Extending the database relational model to capture more meaning. ACM Transactions on Database Systems, 4(4). pp. 397-434

The steps for finding a key are

1) Look for an industry standard and the trusted external source that maintains and verifies it. I count this as a natural key, but you could argue that it is artificial.

2) Look for a natural key in the attributes. Example: (longitude, latitude) makes a good key for a geographical location. A GPS can be used to verify it.

3) If you must design a new identifier, plan it carefully -- especially if people will see and use it. You have to be able to verify it in application programs, so you should have a regular expression, other syntax rules, check digits. You have to be able to be verify in the reality of the model or with a trusted source that you maintain.

Validation means the format is good -- "This could one of our invoice numbers because it is 7 digits long, passes a Bull code check digit and begins with { '01', '02', '07', '99'}"

Verification means that it references a real entity -- "This is a real invoice because I can look it up in Accounts Payable and trace its approval back to Cindy Lu Who on 2005-02-12."


Also look at Codd's rules 8-11 to see why these things cannot be relational. I just did an exmapel of #11 (11. Distribution independence) in the VIN and DVM example.


you need to get a life!
Post #641726
Posted Tuesday, January 31, 2012 10:54 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, March 25, 2014 6:29 AM
Points: 2, Visits: 28
lukin26 (1/22/2009)
[quote]Joe Celko (9/15/2005)

<snip>

you need to get a life!


You need to get an education. He just gave you one for free. Thank the man.
Post #1244607
Posted Thursday, July 18, 2013 3:06 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, July 10, 2014 2:29 AM
Points: 85, Visits: 193
Joe Celko is one of the top authority in all things SQL - not just SQL Server - SQL! Newbie you are.


Post #1475228
Posted Thursday, July 18, 2013 3:16 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 3:21 PM
Points: 13,083, Visits: 11,918
For the record this thread is 8 years old.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1475229
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse