January 26, 2010 at 2:28 pm
I recently had a post about a deadlock issue and since then I have been taking this issue pretty seriously even though I have fixed the deadlock.
Background:
I currently work with a database that has about 8 client applications potentially connected to it, and a web site that could have N clients.
The database does quite a bit of inserts/updates and a frequent purge everyday that purges about 10-25% of the database at once.
I have a table that could have maybe 60-100K of rows in it at a time.
By default SQL Server 2k5, and 8 will create a clustered index for a primary key. My primary key currently is a guid.
This table gets used in views that will join with other tables based on the primary key. I can easily see keeping this guid as the clustered index because the tables that it will be joining on will be a subset of what is in this [Files] table.
I want to make a statement now to make sure that what I am thinking is correct :
Because the data is physically sorted based on the clustered key, using this column as a join will be faster so that the clustered index can point directly to the data in the pages.
My question is, does this hold true even for a guid column?
The conflicting information that I am currently seeing is that typically a nonclustered index would want to be defined on a column that is more selective.
One of the columns in this table is a bit column called 'Error' and another column that is another guid that is not unique [ObjectId]. When turning on the Execution Plan, it says that I need a Non-Clustered index on the Error and ObjectId column.
I get better performance out of the table if I include the Error column first before the ObjectId column. Why would that be for something that isn't really selective?
January 27, 2010 at 1:12 am
Some of what factors into the performance and which column to list first comes from the query and how it is written. Could you provide the query and the execution plan? This information would help the forum to better answer the questions at hand.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
January 27, 2010 at 6:49 am
CirquedeSQLeil (1/27/2010)
Some of what factors into the performance and which column to list first comes from the query and how it is written. Could you provide the query and the execution plan? This information would help the forum to better answer the questions at hand.
The performance is main affected in this query, which happens to be a view:
select ...., ...., Files.Error
...., ....
from table
left join Files on
Files.FileId = table.LatestFileId
about 6 more joins...
left join Files as Files2 on
Files2.EditionObjectId = AnotherTable.Id and Files2.Error = 1
group by
...., ...., Files.Error
...., ....,
January 28, 2010 at 8:06 am
We cannot really help you without lots more information, and true help may be beyond the scope of forum postings/responses.
However, I can say unequivocally that GUIDs are HORRIBLE for clustered primary keys for several reasons.
1) huge datatype
2) this 16 bytes gets carried as part of every non-clustered index, bloating all of them
3) MASSIVE fragmentation due to GUIDs dropping in throughout the range of values. This can be partially mitigated by using a sequential guid.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 28, 2010 at 8:15 am
kingscriber (1/26/2010)
Because the data is physically sorted based on the clustered key, using this column as a join will be faster so that the clustered index can point directly to the data in the pages.
Data is logically sorted in the order of the clustered index. There are several other factors that influence physical order of data in the file or on disk.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 28, 2010 at 8:17 am
TheSQLGuru (1/28/2010)
We cannot really help you without lots more information, and true help may be beyond the scope of forum postings/responses.However, I can say unequivocally that GUIDs are HORRIBLE for clustered primary keys for several reasons.
Understandable. However maybe you can answer this for me. If the purpose for a non-clustered index is mainly for higher selectability (more unique or distinct) then why would the execution plan suggest that I have one on a bit field? It is either a 1 or a 0. That I would think is horrible selectability. I would think it would be more suited for something like a phone number.
I often wondered about the uniqueidentifier being bad for a primary key clustered index. It makes sense due to how SQL Sequentially orders the pages physically.
Do you have a suggestion on migrating to something else? Such as a sequential integer or something like that? What is more popular?
January 28, 2010 at 8:32 am
kingscriber (1/28/2010)
If the purpose for a non-clustered index is mainly for higher selectability (more unique or distinct) then why would the execution plan suggest that I have one on a bit field?
Because it's often very useful in combination with other columns. Without seeing your query or the exec plan, that's all I can really suggest.
This series (3 parts) is maybe worth reading: http://www.sqlservercentral.com/articles/Indexing/68439/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 28, 2010 at 8:34 am
GilaMonster (1/28/2010)
Because it's often very useful in combination with other columns. Without seeing your query or the exec plan, that's all I can really suggest.This series (3 parts) is maybe worth reading: http://www.sqlservercentral.com/articles/Indexing/68439/
Yes it did require another field, which was another guid that did have a higher selectability. Which I suppose makes sense.
Thank you for the link. Will do !
January 28, 2010 at 8:36 am
kingscriber (1/28/2010)
TheSQLGuru (1/28/2010)
We cannot really help you without lots more information, and true help may be beyond the scope of forum postings/responses.However, I can say unequivocally that GUIDs are HORRIBLE for clustered primary keys for several reasons.
Understandable. However maybe you can answer this for me. If the purpose for a non-clustered index is mainly for higher selectability (more unique or distinct) then why would the execution plan suggest that I have one on a bit field? It is either a 1 or a 0. That I would think is horrible selectability. I would think it would be more suited for something like a phone number.
I often wondered about the uniqueidentifier being bad for a primary key clustered index. It makes sense due to how SQL Sequentially orders the pages physically.
Do you have a suggestion on migrating to something else? Such as a sequential integer or something like that? What is more popular?
Say you have a billion rows in a table with a bit field. 99.999999% of them have a value of 0, the rest have a value of 1. Do you think having that bit field indexed would be selective now??
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 28, 2010 at 8:39 am
Have a look at this blog post as well, since it addresses the selectivity question.
http://sqlinthewild.co.za/index.php/2009/01/19/index-columns-selectivity-and-equality-predicates/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 28, 2010 at 9:07 am
>>Do you have a suggestion on migrating to something else? Such as a sequential integer or something like that? What is more popular?
Deciding on an indexing strategy is certainly NOT a popularity contest. 🙂 It is constructed based on your data, hardware, data access patterns, etc. And determining what is best goes way beyond what you can effectively get via some forum posts. I recommend you get a performance tuning professional to assist/mentor you in this issue.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 28, 2010 at 9:16 am
TheSQLGuru (1/28/2010)
>>Do you have a suggestion on migrating to something else? Such as a sequential integer or something like that? What is more popular?Deciding on an indexing strategy is certainly NOT a popularity contest. 🙂 It is constructed based on your data, hardware, data access patterns, etc. And determining what is best goes way beyond what you can effectively get via some forum posts. I recommend you get a performance tuning professional to assist/mentor you in this issue.
LOL...You know, I can't blame you for stating that. From the last week of doing research on these topics...I can easily see why you say that. I suppose that was a dumb question. My apologies.
I am afraid that I am going to have to become that performance tuning professional. I am not foreign to SQL, just foreign to SQL from a DBA perspective. After the past week I have realized that this DB went in one direction when it should have gone in another.
I have SAMS SQL Server 2000 Unleashed and all the Ebooks from the site downloaded as a start.
Thank you for your help. I am sure I will be back 😉
January 28, 2010 at 9:19 am
kingscriber (1/28/2010)
I am afraid that I am going to have to become that performance tuning professional. I am not foreign to SQL, just foreign to SQL from a DBA perspective.
If you have no experience in tuning indexes/queries, I would also suggest that you get someone in. Sure, you can learn it yourself but it's a slow process of trial and error. Better to get someone in that knows they stuff so they can help you and teach you at the same time.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 28, 2010 at 11:00 am
TheSQLGuru (1/28/2010)
>>Do you have a suggestion on migrating to something else? Such as a sequential integer or something like that? What is more popular?Deciding on an indexing strategy is certainly NOT a popularity contest. 🙂 It is constructed based on your data, hardware, data access patterns, etc. And determining what is best goes way beyond what you can effectively get via some forum posts. I recommend you get a performance tuning professional to assist/mentor you in this issue.
I would also throw in that you could keep your primary key on the GUID. However, your clustered index is not required to be on that same key. This is where knowing your data, queries, and schema really comes into play. Thus as Gail and Guru have said, you could learn it (through trial and error) or bring somebody in that knows how to do it and have it done faster.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
Viewing 14 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply