Log in
::
Register
::
Not logged in
Home
Tags
Articles
Editorials
Stairways
Forums
Scripts
Videos
Blogs
QotD
Books
Ask SSC
SQL Jobs
Training
Authors
About us
Contact us
Newsletters
Write for us
Recent Posts
Recent Posts
Popular Topics
Popular Topics
Home
Search
Members
Calendar
Who's On
Home
»
SQL Server 2008
»
SQL Server 2008 - General
»
Primary key or not to primary key, that is...
15 posts, Page 1 of 2
1
2
»»
Primary key or not to primary key, that is the question
Rate Topic
Display Mode
Topic Options
Author
Message
tafountain
tafountain
Posted Monday, March 04, 2013 10:58 AM
SSC-Enthusiastic
Group: General Forum Members
Last Login: Wednesday, June 12, 2013 9:48 AM
Points: 104,
Visits: 336
Yeah, bad taste on the subject but nonetheless, you're reading my post :). I'd like to pose a scenario and see what other developers are doing out there (or would recommend at least).
We have most of our tables defined with a PRIMARY KEY constraint over a single column that is defined as UNIQUEIDENTIFIER with a default of NEWID(). This is never clustered unless the table is small. We use this in our child tables as the foreign key reference like so (excuse the pseudo table definitions):
Table:Parent
ID UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
Field VARCHAR(30) NOT NULL,
Code INT NOT NULL
Status VARCHAR(3) NOT NULL
Table:Child
SurrogateID INT IDENTITY(1, 1) NOT NULL, --has a unique clustered index defined
ID UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
Field1 VARCHAR(30) NOT NULL,
ParentID UNIQUEIDENTIFIER NOT NULL (FK defined to Parent.ID)
Obviously the primary key constraint maintains an index behind the scenes. When we query between the two tables it will look soemthing like this:
SELECT P.ID, P.Field, P.Code, P.Status, C.ID, C.Field1
FROM dbo.Parent AS P
INNER JOIN dbo.Child AS C
ON P.ID = C.ParentID
Looks simple right? This query should (and does for me) do an index seek using the primary key constraint. But it also generates an expensive key lookup (clustered) over the clustered index. The only way to resolve that is to add included columns in the primary key to resolve it.
Yes... primary keys indeed *DO NOT* allow included columns to be defined. At this point I have one of two options:
- Drop the PRIMARY KEY and simply make a unique index with the necessary included columns
- Create a unique index over Parent.ID with the neccessary included columns. However this results in doubling the index storage required for the Parent.ID column.
So the question is this, do you:
- drop the primary key and create the unique index with included columns
- duplicate the primary key and create the unique index with included columns
I'm leaning towards dropping the PK and creating the unique index myself. Thoughts?
Post #1426374
Sean Lange
Sean Lange
Posted Monday, March 04, 2013 12:10 PM
SSCrazy Eights
Group: General Forum Members
Last Login: Today @ 12:04 PM
Points: 8,969,
Visits: 8,531
Just a rambling here...you seem to be somewhat concerned with storage space. You also have obviously realized the challenges of using a unique identifier as your PK. You have ended up with 2 columns of unique values. Why not just drop the unique IDs all together and go with your identity column? It is faster, easier to maintain, easier to debug queries, indexes don't get fragmented etc, etc etc...
_______________________________________________________________
Need help? Help us help you.
Read the article at
http://www.sqlservercentral.com/articles/Best+Practices/61537/
for best practices on asking questions.
Need to split a string? Try Jeff Moden's
splitter
.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Post #1426398
Evil Kraig F
Evil Kraig F
Posted Monday, March 04, 2013 12:43 PM
SSCertifiable
Group: General Forum Members
Last Login: Yesterday @ 5:35 PM
Points: 5,722,
Visits: 6,194
tafountain (3/4/2013)
We have most of our tables defined with a PRIMARY KEY constraint over a single column that is defined as UNIQUEIDENTIFIER with a default of NEWID().
Ow. Replicating? If so you'll have to maintain the PK on the tables. If not, why are you using a GUID?
What is the clustered on these tables that your primary join mechanism is being relegated to an NC index? Is this a rare connection as far as the tables are concerned?
This is working in the dark, really, as we don't know the rest of the expectations of your system. There're indicators here that there's a lot more in the overhead going on than just a simple schema exercise.
- Craig Farrell
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions
|
Forum Netiquette
For index/tuning help, follow these directions.
|
Tally Tables
Twitter: @AnyWayDBA
Post #1426418
tafountain
tafountain
Posted Monday, March 18, 2013 12:31 PM
SSC-Enthusiastic
Group: General Forum Members
Last Login: Wednesday, June 12, 2013 9:48 AM
Points: 104,
Visits: 336
Sorry for the delayed response. I've been working on other items lately. Anyways, here I am :).
@Sean - yes I am concerned with storage space but you hit it on the head, this is a typical surrogate key issue. All-in-all I guess the decision is to live with the duplicate scenario as long as the storage requirements aren't simply ridiculous (for the surrogate key situation anyway). I would love to go the identity route, however we maintain data across environments and need the GUID values to support this. This might change in the future but not today.
@Evil - see the last part of my response to Sean - and at one point we did have replication setup to a reporting server but have since simplified that with a simply copy-only backup.
Post #1432306
TheSQLGuru
TheSQLGuru
Posted Tuesday, March 19, 2013 7:57 AM
Hall of Fame
Group: General Forum Members
Last Login: Today @ 10:32 AM
Points: 3,678,
Visits: 5,176
1) if you are worried with space, GUIDs have no place in your system.
2) why aren't you using NEWSEQUENTIALID??
3) Your PK on the parent does a clustered index on the NEWID, since you didn't specify one.
4) You are fragmenting the heck out of your nc indexes on those GUIDs.
5) Did I mention that GUIDs SUCK yet?!? Oh, and for multiple systems, you can almost ALWAYS come up with a simple multi-part numeric arrangement that is guaranteed to be unique across all systems. I have used a tiny/smallint ServerID and integer identity construct at numerous clients to eliminate GUIDS they "HAD to have", always with GREAT effect.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1432662
Eugene Elutin
Eugene Elutin
Posted Tuesday, March 19, 2013 8:12 AM
SSCrazy
Group: General Forum Members
Last Login: Today @ 9:40 AM
Points: 2,596,
Visits: 4,507
2) why aren't you using NEWSEQUENTIALID??
I don't know why OP doesn't, but me...
I don't use it because it's useless! It does not guarantee sequential order after reboot as it's just a simple wrapper around UuidCreateSequential WinAPI which never had order guarantee...
BTW, I try not to use GUID's at all, however sometimes it's unavoidable. The company I'm consulting right now had bought into 3-rd party product which has
CLUSTERED
GUID's PKS in all tables.
So, I enjoy looking into index fragmentation stats every day here
...
_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!"
(So many miracle inventions provided by MS to us...)
How to post your question to get the best and quick help
Post #1432672
TheSQLGuru
TheSQLGuru
Posted Tuesday, March 19, 2013 8:27 AM
Hall of Fame
Group: General Forum Members
Last Login: Today @ 10:32 AM
Points: 3,678,
Visits: 5,176
Eugene Elutin (3/19/2013)
2) why aren't you using NEWSEQUENTIALID??
I don't know why OP doesn't, but me...
I don't use it because it's useless! It does not guarantee sequential order after reboot as it's just a simple wrapper around UuidCreateSequential WinAPI which never had order guarantee...
BTW, I try not to use GUID's at all, however sometimes it's unavoidable. The company I'm consulting right now had bought into 3-rd party product which has
CLUSTERED
GUID's PKS in all tables.
So, I enjoy looking into index fragmentation stats every day here
...
You are absolutely wrong in your statement. It is certainly NOT useless. It avoids fragmentation for as long as the server remains up. You currently (with NEWID) get values spread throughout the range continually. That isn't just about page splits, fragmentation, less full pages, etc either. It keeps hot (i.e. recent) data in the buffer pool much more frequently as well.
I note that if you DO have GUIDs you MUST manage them, or suffer a variety of negative consequences. Some of those consequences you can't do anything about however (such as size of data) other than buying bigger hardware.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1432683
Eugene Elutin
Eugene Elutin
Posted Tuesday, March 19, 2013 8:53 AM
SSCrazy
Group: General Forum Members
Last Login: Today @ 9:40 AM
Points: 2,596,
Visits: 4,507
TheSQLGuru (3/19/2013)
Eugene Elutin (3/19/2013)
2) why aren't you using NEWSEQUENTIALID??
I don't know why OP doesn't, but me...
I don't use it because it's useless! It does not guarantee sequential order after reboot as it's just a simple wrapper around UuidCreateSequential WinAPI which never had order guarantee...
BTW, I try not to use GUID's at all, however sometimes it's unavoidable. The company I'm consulting right now had bought into 3-rd party product which has
CLUSTERED
GUID's PKS in all tables.
So, I enjoy looking into index fragmentation stats every day here
...
You are absolutely wrong in your statement. It is certainly NOT useless. It avoids fragmentation for as long as the server remains up. You currently (with NEWID) get values spread throughout the range continually. That isn't just about page splits, fragmentation, less full pages, etc either. It keeps hot (i.e. recent) data in the buffer pool much more frequently as well.
I note that if you DO have GUIDs you MUST manage them, or suffer a variety of negative consequences. Some of those consequences you can't do anything about however (such as size of data) other than buying bigger hardware.
Ok, the "not guaranteed sequence" is not a single NEWSEQUENTIALID issue.
Actually it's doesn't guarantees even uniqueness in all cases (if your computer has no NIC installed, then GUID generetated by that function will only be unique within this computer). But, when you have NIC installed, it makes it unsecured, as it's based on the MAC address of NIC and can be predicted.
All about beauties of it is here:
http://msdn.microsoft.com/en-gb/library/windows/desktop/aa379322(v=vs.85).aspx
Another issue you will face with this if you have .NET application which also will want to generate "sequential" GUID's, you will need to come up with a special logic to match SQL Server "byte scrambling" (
http://www.jorriss.net/blog/archive/2008/04/24/unraveling-the-mysteries-of-newsequentialid
).
So, I do hold my humble opinion that the best way to manage GUID's in SQL Server database is to try avoiding them completely
And the last bit about
It avoids fragmentation for as long as the server remains up.
That is really helpful! So, as long you never reboot your server - you are fine. But if you do it at least once in its life, than you staffed, as sequential GUID's could be lower in its values than generated before reboot, so you will get your fragmentation issue back.
So, make sure you never reboot your server in order to enjoy full set of NEWSEQUENTIALID powerful features
Sorry, it's not my cup of tea...
_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!"
(So many miracle inventions provided by MS to us...)
How to post your question to get the best and quick help
Post #1432704
tafountain
tafountain
Posted Wednesday, March 20, 2013 8:07 AM
SSC-Enthusiastic
Group: General Forum Members
Last Login: Wednesday, June 12, 2013 9:48 AM
Points: 104,
Visits: 336
TheSQLGuru,
I did not intend to spark a debate :) but here are my responses.
TheSQLGuru (3/19/2013)
1) if you are worried with space, GUIDs have no place in your system.
This statement is a little unsubstantiated. GUIDs are fine, sure they're larger than INT but they fulfull requirements that an INT cannot. For example we need to maintain uniqueness across servers. A GUID does this for us. Yes, they have drawbacks but I would not totally exclude them as an option because of them.
TheSQLGuru (3/19/2013)
2) why aren't you using NEWSEQUENTIALID??
These do not provide uniqueness across tables, let alone across servers.
TheSQLGuru (3/19/2013)
3) Your PK on the parent does a clustered index on the NEWID, since you didn't specify one.
Is this a question or a statement? Not sure what you're trying to communicate.
TheSQLGuru (3/19/2013)
4) You are fragmenting the heck out of your nc indexes on those GUIDs.
Yes - this is very true. However most of the indexes are defragmented in under 30 seconds (most under 10 seconds) during our nightly maintenance.
TheSQLGuru (3/19/2013)
5) Did I mention that GUIDs SUCK yet?!? Oh, and for multiple systems, you can almost ALWAYS come up with a simple multi-part numeric arrangement that is guaranteed to be unique across all systems. I have used a tiny/smallint ServerID and integer identity construct at numerous clients to eliminate GUIDS they "HAD to have", always with GREAT effect.
Ok, now you have my interest :). I wouldn't mind understanding your implementation.
Post #1433238
TheSQLGuru
TheSQLGuru
Posted Wednesday, March 20, 2013 8:24 AM
Hall of Fame
Group: General Forum Members
Last Login: Today @ 10:32 AM
Points: 3,678,
Visits: 5,176
tafountain (3/20/2013)
TheSQLGuru,
I did not intend to spark a debate :) but here are my responses.
TheSQLGuru (3/19/2013)
1) if you are worried with space, GUIDs have no place in your system.
This statement is a little unsubstantiated. GUIDs are fine, sure they're larger than INT but they fulfull requirements that an INT cannot. For example we need to maintain uniqueness across servers. A GUID does this for us. Yes, they have drawbacks but I would not totally exclude them as an option because of them.
TheSQLGuru (3/19/2013)
2) why aren't you using NEWSEQUENTIALID??
These do not provide uniqueness across tables, let alone across servers.
TheSQLGuru (3/19/2013)
3) Your PK on the parent does a clustered index on the NEWID, since you didn't specify one.
Is this a question or a statement? Not sure what you're trying to communicate.
TheSQLGuru (3/19/2013)
4) You are fragmenting the heck out of your nc indexes on those GUIDs.
Yes - this is very true. However most of the indexes are defragmented in under 30 seconds (most under 10 seconds) during our nightly maintenance.
TheSQLGuru (3/19/2013)
5) Did I mention that GUIDs SUCK yet?!? Oh, and for multiple systems, you can almost ALWAYS come up with a simple multi-part numeric arrangement that is guaranteed to be unique across all systems. I have used a tiny/smallint ServerID and integer identity construct at numerous clients to eliminate GUIDS they "HAD to have", always with GREAT effect.
Ok, now you have my interest :). I wouldn't mind understanding your implementation.
1) As I already said, you can (and I have several times) maintained uniqueness across multiple servers with an int identity and a separate ServerID field (tiny or small int) that is used as the compound PK for the table. Given that ServerA ALWAYS uses 1 for ServerID, ServerB ALWAYS uses 2 for ServerID, etc, you are GUARANTEED to NEVER get duplicate PK values with that 5 or 6 byte PK (instead of the SIXTEEN bytes of a GUID).
2) I believe you are wrong in stating that NEWSEQUENTIALID isn't UNIQUE. The only time it would not be is if there is no NIC on the server. Got many of those lying around?? I haven't seen one in the 20+ years I have been with databases, 15 of them as a SQL Server consultant. Here is the reference, which comes directly from SQL Server Books Online: http://msdn.microsoft.com/en-us/library/aa379322(VS.85).aspx
3) My "PK is clustered index" is a statement. If you simply define a PK on a table it is clustered by default unless you override that with explicit syntax to make it non-clustered.
4) It may not matter (much) on your system, but the extra bytes required for a GUID over other datatype(s) or compound keys REALLY make a difference on systems with any reasonable number of rows.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1433249
« Prev Topic
|
Next Topic »
15 posts, Page 1 of 2
1
2
»»
Permissions
You
cannot
post new topics.
You
cannot
post topic replies.
You
cannot
post new polls.
You
cannot
post replies to polls.
You
cannot
edit your own topics.
You
cannot
delete your own topics.
You
cannot
edit other topics.
You
cannot
delete other topics.
You
cannot
edit your own posts.
You
cannot
edit other posts.
You
cannot
delete your own posts.
You
cannot
delete other posts.
You
cannot
post events.
You
cannot
edit your own events.
You
cannot
edit other events.
You
cannot
delete your own events.
You
cannot
delete other events.
You
cannot
send private messages.
You
cannot
send emails.
You
may
read topics.
You
cannot
rate topics.
You
cannot
vote within polls.
You
cannot
upload attachments.
You
may
download attachments.
You
cannot
post HTML code.
You
cannot
edit HTML code.
You
cannot
post IFCode.
You
cannot
post JavaScript.
You
cannot
post EmotIcons.
You
cannot
post or upload images.
Copyright © 2002-2013 Simple Talk Publishing. All Rights Reserved.
Privacy Policy.
Terms of Use.
Report Abuse.