Edit a Clustered index

  • Hi all,

    I have made a script changing the clustered index in our databases on all tables from the GUID-column (Primary key on this column) to a new integer column that is auto incremented (id int IDENTITY(1,1) NOT NULL). This operation was made to reduce performance issues as well as deadllocks in the databases. The Primary key will continue to be on the GUID column but as non-clustered primary key.

    The Deadlocks were reduced but no significat performance was gained. So I was a bit dissapointed.

    Looking onto the new clustered index i have seen that in my script i forgot the word 'Unique' when creating it. This is what im talking about: "CREATE UNIQUE CLUSTERED INDEX [CI_Table] ON Table (Column ASC)"

    Here's my question:

    Would I gain even better performance and/or reduce Deadlock's by creating a new script changing these clustered index on the tables they were created on by changing them to unique clustered index, as they of course, should have been from the beginning ?

    Best regards, mickegohle

  • There are unfortunately no answers to broad and generic questions like this.

    As a rule of thumb, giving SQL Server more information is always better - not just for performance (your current database will not throw an error when a duplicate value is accidentally entered). It is likely that the performance of your database will increase if you explicitly declare the indexes as unique. It is unlikely that performance will suffer. However, without knowing the specifics of your database design and code, there's no way to tell for sure.

    As for deadlocks, it is very unlikely that any change to data types or to index types will change that significantly, most deadlocks are caused by non-optimal coding patterns in the application code.

    I am also very unhappy whenever I read posts that say that they did something for "all tables". They are not all used the same, so why do you all design them the same?

    A good database designer will ask hirself these questions:

    1. Does this table need a surrogate key at all? (You seem to have skipped this question, because you are only plpaying with the data type of and index type on the surrogate key. My experience is that at least 50% of all tables that are designed with a surrogate key do not need it, do not benefit from it, and in a lot of cases would even be better without it).

    1.a. If the table needs a surrogate key, what should the data type be?

    2. Which columns and which combinations fo columns are candidate keys in this table? Which columns and column combinations are good candidates for nonunique indexes in this table? Which of the candidate keys and nonunique index candidates is the best choice for the clustered index? (As a rule of thumb, I'd suggest always having a clustered index on each table, and always having an index for each candidate key - preferably the one that is auto-created when you define the constraint. For nonunique index candidates there is a bit more wiggle room for deciding not to implement all).

    Each of these questions will have different answers for different tables, so you do not ask them once per database, but once per table. If that sounds like a lot of work to you, then you are right - it is. Nobody ever said that this would be easy.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Yes you will see a great improvement in performance by specifying UNIQUE for a CLUSTERED index. Back in sql 6.5 I saw a 4X improvement. Not sure if it still holds true today. The tighter the constraint, the better sql can optimize. Without uniqueness, sql is forced to use rowids to find rows on a page.

    Personally I think you could get rid of the clustering altogether and see better performance with a heap. Guys here hate heaps but I love em.

  • Changing the clustered key from a ramdom GUID to a sequenial integer will help prevent page splits, so that that helps reduce I/O and blocking during inserts. It appears that at least has proven true.

    Whether it will improve performance of a specific query depends; for example a full table scan still reads every page in the table regardless of how it's clustered. One thing to consider is that the clustering key is used as the table's row identifier, and it's also included for bookmarks in non-clustered indexes. If the clustering key is not declared as unique, then SQL Server will append an additional 4 byte uniqueifier to make it unique, which consumes more space in index pages and is less efficient for joins than a simple integer without a prefix. So, I would consider adding the UNIQUE declaration to the clustered index to be worthwhile.

    Also, look at your execution plans and determine if you can lever non-clustered indexes wherever it's currenting using table scans. Ideally a query will be covered by a non-clustered index in which case it doesn't read from the base table.

    This article explains why clustering keys should be narrow, static, sequential, and unique.

    https://www.simple-talk.com/sql/learn-sql-server/effective-clustered-indexes/

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Bill Talada (1/21/2016)


    Without uniqueness, sql is forced to use rowids to find rows on a page.

    Not true. Without a clustered index (i.e., in a heap), SQL Server needs to use the RID to find rows. But there is, as far as SQL Server is concerned, no need for any uniqueness. (However, beware that this is not only a violation of one of Codd's rules, the basic principles of relational databases, but also a huge invitation to duplicated data).

    Personally I think you could get rid of the clustering altogether and see better performance with a heap. Guys here hate heaps but I love em.

    Heaps can be okay in some very specific circumstances. But you really need to know what issues they can cause and how to fix or prevent them, and most people don't. As a rule of thumb, my recommendation is to always have a clustered index on every table, unless you really know what a heap is and how it behaves.

    A generic advice to get rid of clustering altogether is terrible, and I hope that nobody will take that as a serious recommendation.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (1/21/2016)


    Bill Talada (1/21/2016)


    Without uniqueness, sql is forced to use rowids to find rows on a page.

    Not true. Without a clustered index (i.e., in a heap), SQL Server needs to use the RID to find rows. But there is, as far as SQL Server is concerned, no need for any uniqueness.

    Not true. 🙂

    https://msdn.microsoft.com/en-us/library/ms177484.aspx

    If the clustered index is not a unique index, SQL Server makes any duplicate keys unique by adding an internally generated value called a uniqueifier.

    So, SQL Server requires uniqueness to correctly identify every record.

    If creates unique add-on's to the clustered index you created where it fails to satisfy the requirement.

    _____________
    Code for TallyGenerator

  • Sergiy (2/4/2016)


    Hugo Kornelis (1/21/2016)


    Bill Talada (1/21/2016)


    Without uniqueness, sql is forced to use rowids to find rows on a page.

    Not true. Without a clustered index (i.e., in a heap), SQL Server needs to use the RID to find rows. But there is, as far as SQL Server is concerned, no need for any uniqueness.

    Not true. 🙂

    https://msdn.microsoft.com/en-us/library/ms177484.aspx

    If the clustered index is not a unique index, SQL Server makes any duplicate keys unique by adding an internally generated value called a uniqueifier.

    So, SQL Server requires uniqueness to correctly identify every record.

    If creates unique add-on's to the clustered index you created where it fails to satisfy the requirement.

    I don't know why you feel a desire to revive a two-week old topic just in order to contradict me. But if you do such a thing, then at least read what I wrote before you do.

    My "not true" was a direct response to Bill's statement that "without uniqueness (in any type of index - HK), sql is forced to use rowids to find rows on a page".

    Your post, while containing only absolutrely correct statements about nonunique clustered indexes, is absolutely unrelated to this statement.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Well, if you go back to my original post you will see that I was talking about a unique clustered index.

    ...but everyone on this board is blind and has to save their ego and posture and make everyone else "not true".

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply