Clustered & Duplicate Indexes

  • I have inherited a bunch of SQL Servers that never had a DBA and the app and DB design and development was done

    by the developers. Though most of the SQL code seems to be OK, there are some basic issues with the design, like

    a bunch of tables with no clustered indexes and some tables with duplicate keys. Some keys are duplicate

    in their entirety and some have same starting columns, like index1 on col1, col2, and col3, and then index2 on col1 and col2.

    As far as I know, not having a clustered index for large tables is a huge mistake, besides wasted space, inability

    to reorganize it causes performance issues too. However one of the experienced developer says, they had

    some performance issues with a clustered indexes so they chose not to use it. He could not recall the issue.

    Some of the duplicate key designs are done by folks who have since left the dept. Now before I start fixing

    stuff, I would like to get your feedback on

    if any of you see any reason for not having a clustered index ( I am not picky on having one on small tables

    but we are talking about tables with over mil records and 1 GB size)

    if there is any need to have indexes on on the same starting columns that are already indexed.

    Thanks.

  • The new Ken Henderson book on internals points out nicely that page splits cause nonclustered indexes to move because there is no "key" in the nonclustered index that points to the clustered index. Instead it points to the physical location, which moves in a page split (potentially). This can result in performance issues. Or it may not. Depends on your situation.

    So in general, a clustered index is preferred on every table. If you had performance issues (not sure you would, but it's possible), then perhaps the choice of the column for the clustered index was a poor one. In general you want a clustered index to be on some type of range or column with low selectivity. Non clustered index tend to be favored in high selectivity columns.

    Steve Jones

    sjones@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/sjones

    The Best of SQL Server Central.com 2002 - http://www.sqlservercentral.com/bestof/

    http://www.dkranch.net

  • I'll agree with Steve on the Clustered indexes. I personally have a rule, rules are made to be broken though, that every table had a clustered index.

    Never make the clustered index on a sequential key, at least if there are a large number of transactions inserting at once.

    As far as the duplicate index, that is a waste, the optimizer "should" be using Index 1 for searches against Field1, Field1 and 2, or all three fields. So Index 2 is probably rarely used.

    KlK, MCSE


    KlK

  • Now I am confused. I have a master / detail situation. Master has 1mil + records and is cluster indexed on MasterId Identity Integer field. Detail has 5mil+ records and is clustered indexed on MasterId, DetailId where DetailId is also Identity Index.

    Detail is always accessed via MasterId. Large number (approx 10 thousand) of master/details are added at once but are of course added to end of each table.

    Retrieval of individual contracts (alt index on master) with their detail is very fast (10th sec) and select M.*,D.* all records is also "relatively" fast.

    Are we saying there is a better faster way to index these large table? Since we are not adding back additional records withing keys there should not be any page spliting etc.

    SmithDM


    SmithDM

  • Now I am confused. I have a master / detail situation. Master has 1mil + records and is cluster indexed on MasterId Identity Integer field. Detail has 5mil+ records and is clustered indexed on MasterId, DetailId where DetailId is also Identity Index.

    Detail is always accessed via MasterId. Large number (approx 10 thousand) of master/details are added at once but are of course added to end of each table.

    Retrieval of individual contracts (alt index on master) with their detail is very fast (10th sec) and select M.*,D.* all records is also "relatively" fast.

    Are we saying there is a better faster way to index these large table? Since we are not adding back additional records withing keys there should not be any page spliting etc.

    SmithDM


    SmithDM

  • SmithDM: For Details table the index is probably a good choice. For the master table however I don't think it gives you any good advantage, you would probably be better off with the clustered index on some other column, for instance a column that you use for sorting master.* queries.

    --

    Chris Hedgate @ Extralives (http://www.extralives.com/)

    Contributor to Best of SQL Server Central 2002 (http://www.sqlservercentral.com/bestof/)

    Articles: http://www.sqlservercentral.com/columnists/chedgate/

  • Forgot to answer the initial question: sxg6023, I also agree with Steve and kknudson. What Steve mentions, forward-pointers, can really become a performance issue. There are other reasons as well for 'always' having clustered indexes on your tables. Please see my articles on the subject here at SQL Server Central (use link in my signature below).

    --

    Chris Hedgate @ Extralives (http://www.extralives.com/)

    Contributor to Best of SQL Server Central 2002 (http://www.sqlservercentral.com/bestof/)

    Articles: http://www.sqlservercentral.com/columnists/chedgate/

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply