Conflicting Information

  • I recently had a post about a deadlock issue and since then I have been taking this issue pretty seriously even though I have fixed the deadlock.

    Background:

    I currently work with a database that has about 8 client applications potentially connected to it, and a web site that could have N clients.

    The database does quite a bit of inserts/updates and a frequent purge everyday that purges about 10-25% of the database at once.

    I have a table that could have maybe 60-100K of rows in it at a time.

    By default SQL Server 2k5, and 8 will create a clustered index for a primary key. My primary key currently is a guid.

    This table gets used in views that will join with other tables based on the primary key. I can easily see keeping this guid as the clustered index because the tables that it will be joining on will be a subset of what is in this [Files] table.

    I want to make a statement now to make sure that what I am thinking is correct :

    Because the data is physically sorted based on the clustered key, using this column as a join will be faster so that the clustered index can point directly to the data in the pages.

    My question is, does this hold true even for a guid column?

    The conflicting information that I am currently seeing is that typically a nonclustered index would want to be defined on a column that is more selective.

    One of the columns in this table is a bit column called 'Error' and another column that is another guid that is not unique [ObjectId]. When turning on the Execution Plan, it says that I need a Non-Clustered index on the Error and ObjectId column.

    I get better performance out of the table if I include the Error column first before the ObjectId column. Why would that be for something that isn't really selective?

    [font="Tahoma"]forgive them, for they do not know what they do.[/font]
  • Some of what factors into the performance and which column to list first comes from the query and how it is written. Could you provide the query and the execution plan? This information would help the forum to better answer the questions at hand.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • CirquedeSQLeil (1/27/2010)


    Some of what factors into the performance and which column to list first comes from the query and how it is written. Could you provide the query and the execution plan? This information would help the forum to better answer the questions at hand.

    The performance is main affected in this query, which happens to be a view:

    select ...., ...., Files.Error

    ...., ....

    from table

    left join Files on

    Files.FileId = table.LatestFileId

    about 6 more joins...

    left join Files as Files2 on

    Files2.EditionObjectId = AnotherTable.Id and Files2.Error = 1

    group by

    ...., ...., Files.Error

    ...., ....,

    [font="Tahoma"]forgive them, for they do not know what they do.[/font]
  • We cannot really help you without lots more information, and true help may be beyond the scope of forum postings/responses.

    However, I can say unequivocally that GUIDs are HORRIBLE for clustered primary keys for several reasons.

    1) huge datatype

    2) this 16 bytes gets carried as part of every non-clustered index, bloating all of them

    3) MASSIVE fragmentation due to GUIDs dropping in throughout the range of values. This can be partially mitigated by using a sequential guid.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • kingscriber (1/26/2010)


    Because the data is physically sorted based on the clustered key, using this column as a join will be faster so that the clustered index can point directly to the data in the pages.

    Data is logically sorted in the order of the clustered index. There are several other factors that influence physical order of data in the file or on disk.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • TheSQLGuru (1/28/2010)


    We cannot really help you without lots more information, and true help may be beyond the scope of forum postings/responses.

    However, I can say unequivocally that GUIDs are HORRIBLE for clustered primary keys for several reasons.

    Understandable. However maybe you can answer this for me. If the purpose for a non-clustered index is mainly for higher selectability (more unique or distinct) then why would the execution plan suggest that I have one on a bit field? It is either a 1 or a 0. That I would think is horrible selectability. I would think it would be more suited for something like a phone number.

    I often wondered about the uniqueidentifier being bad for a primary key clustered index. It makes sense due to how SQL Sequentially orders the pages physically.

    Do you have a suggestion on migrating to something else? Such as a sequential integer or something like that? What is more popular?

    [font="Tahoma"]forgive them, for they do not know what they do.[/font]
  • kingscriber (1/28/2010)


    If the purpose for a non-clustered index is mainly for higher selectability (more unique or distinct) then why would the execution plan suggest that I have one on a bit field?

    Because it's often very useful in combination with other columns. Without seeing your query or the exec plan, that's all I can really suggest.

    This series (3 parts) is maybe worth reading: http://www.sqlservercentral.com/articles/Indexing/68439/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (1/28/2010)


    Because it's often very useful in combination with other columns. Without seeing your query or the exec plan, that's all I can really suggest.

    This series (3 parts) is maybe worth reading: http://www.sqlservercentral.com/articles/Indexing/68439/

    Yes it did require another field, which was another guid that did have a higher selectability. Which I suppose makes sense.

    Thank you for the link. Will do !

    [font="Tahoma"]forgive them, for they do not know what they do.[/font]
  • kingscriber (1/28/2010)


    TheSQLGuru (1/28/2010)


    We cannot really help you without lots more information, and true help may be beyond the scope of forum postings/responses.

    However, I can say unequivocally that GUIDs are HORRIBLE for clustered primary keys for several reasons.

    Understandable. However maybe you can answer this for me. If the purpose for a non-clustered index is mainly for higher selectability (more unique or distinct) then why would the execution plan suggest that I have one on a bit field? It is either a 1 or a 0. That I would think is horrible selectability. I would think it would be more suited for something like a phone number.

    I often wondered about the uniqueidentifier being bad for a primary key clustered index. It makes sense due to how SQL Sequentially orders the pages physically.

    Do you have a suggestion on migrating to something else? Such as a sequential integer or something like that? What is more popular?

    Say you have a billion rows in a table with a bit field. 99.999999% of them have a value of 0, the rest have a value of 1. Do you think having that bit field indexed would be selective now??

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Have a look at this blog post as well, since it addresses the selectivity question.

    http://sqlinthewild.co.za/index.php/2009/01/19/index-columns-selectivity-and-equality-predicates/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • >>Do you have a suggestion on migrating to something else? Such as a sequential integer or something like that? What is more popular?

    Deciding on an indexing strategy is certainly NOT a popularity contest. 🙂 It is constructed based on your data, hardware, data access patterns, etc. And determining what is best goes way beyond what you can effectively get via some forum posts. I recommend you get a performance tuning professional to assist/mentor you in this issue.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (1/28/2010)


    >>Do you have a suggestion on migrating to something else? Such as a sequential integer or something like that? What is more popular?

    Deciding on an indexing strategy is certainly NOT a popularity contest. 🙂 It is constructed based on your data, hardware, data access patterns, etc. And determining what is best goes way beyond what you can effectively get via some forum posts. I recommend you get a performance tuning professional to assist/mentor you in this issue.

    LOL...You know, I can't blame you for stating that. From the last week of doing research on these topics...I can easily see why you say that. I suppose that was a dumb question. My apologies.

    I am afraid that I am going to have to become that performance tuning professional. I am not foreign to SQL, just foreign to SQL from a DBA perspective. After the past week I have realized that this DB went in one direction when it should have gone in another.

    I have SAMS SQL Server 2000 Unleashed and all the Ebooks from the site downloaded as a start.

    Thank you for your help. I am sure I will be back 😉

    [font="Tahoma"]forgive them, for they do not know what they do.[/font]
  • kingscriber (1/28/2010)


    I am afraid that I am going to have to become that performance tuning professional. I am not foreign to SQL, just foreign to SQL from a DBA perspective.

    If you have no experience in tuning indexes/queries, I would also suggest that you get someone in. Sure, you can learn it yourself but it's a slow process of trial and error. Better to get someone in that knows they stuff so they can help you and teach you at the same time.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • TheSQLGuru (1/28/2010)


    >>Do you have a suggestion on migrating to something else? Such as a sequential integer or something like that? What is more popular?

    Deciding on an indexing strategy is certainly NOT a popularity contest. 🙂 It is constructed based on your data, hardware, data access patterns, etc. And determining what is best goes way beyond what you can effectively get via some forum posts. I recommend you get a performance tuning professional to assist/mentor you in this issue.

    I would also throw in that you could keep your primary key on the GUID. However, your clustered index is not required to be on that same key. This is where knowing your data, queries, and schema really comes into play. Thus as Gail and Guru have said, you could learn it (through trial and error) or bring somebody in that knows how to do it and have it done faster.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply