Introduction to Indexes: Part 2 – The clustered index

  • hlam-1032421 (11/11/2009)


    With comments like this I run for the exit:

    … good clustering. One school says to put the clustered index on the column or set of columns that would be most useful for queries, either frequently run queries or ones doing queries of large ranges of data. The other school says to use the clustered index primarily to organise the table and leave data access to the nonclustered indexes.I hold to the second school…

    when you're on a conference call with a bunch of VP's and developers and everyone is blaming SQL and emailing notes like SQL is only a single user database then it's time to start breaking some rules to fix performance issues

  • bob 54859 (11/11/2009)


    Ok,

    So... here is a question we have debated in our shop for quite a while.

    The question is...

    Given a table where they is no good candidate for a clustered index, for example we use client generated GUIDs for primary keys and most of the data does not meet the for criteria you outline above... Which approach should we take:

    1. Keep the table as a heap.

    2. Create an identity field on the table which is used for no other reason that to create a clustered index upon it.

    I believe Kim Tripp recommended number 2 in a podcast iirc. however, we have a dev here who is adamant that this is a mistake.

    I'm am scared that the answer will be "it depends".

    HELP!

    Thanks,

    BOb

    we do #2 all the time. sometimes we'll get a table structure from the devs with no PK. we'll kick it back and ask where we can create a PK or just an indentity column to serve a PK? we have a lot of tables like this and don't have any problems.

  • John Colby (11/12/2009)


    I had been (up until reading this article) creating a clustered index that was also a cover index, i.e. the index had every single field.

    The clustered index is always covering, because it has the entire row in the leaf level.

    Obviously a WIDE index, however I do not need any other index in this case.

    I disagree. An index on (Zip5, Zip4) is only useful (for index seeks) when there's a filter on Zip5, alone or in combination with Zip4. If there's a query that filters on just Zip4, that index cannot be used for a seek operation. If you want an index seek, you'd need a separate index on Zip4.

    Also, given the lack of future inserts, is there any reason NOT to create the index with the Zip5 / Zip4 as the leading fields in the index. I assume that would make the data physically sort on zip instead of physically sort on PK. Again I join on PK but use where and order by on zip.

    If you join on the pk, then you may need an index with a leading column of the pk. Usually the order for columns in an index (when used for seeks and order by) would be (columns used in where/join, columns specified in order by)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • John Colby (11/12/2009)


    I routinely work with tables where the entire table is a PK, name and address and five other fields. these are just mailing lists, but HUGE - to me anyway - 50 million names is typical.

    The tables are address validated monthly, other than that they are stable, no insertions or deletions ever, only the monthly update. The monthly address validation updates about 1 to 2% of the records monthly.

    I had been (up until reading this article) creating a clustered index that was also a cover index, i.e. the index had every single field. Obviously a WIDE index, however I do not need any other index in this case. Because there is never an insertion or deletion, that discussion goes away.

    So is there any reason (in THIS case) NOT to do things this way?

    Also, given the lack of future inserts, is there any reason NOT to create the index with the Zip5 / Zip4 as the leading fields in the index. I assume that would make the data physically sort on zip instead of physically sort on PK. Again I join on PK but use where and order by on zip.

    Or should I still create the clustered index on PK (Unique) and then add a cover index of every field (they are all used in various processing).

    Gail, I have to agree - "run for the doors" comments are unhelpful and in poor taste. "Expound in detail or write your own article about why".

    You say you "do not need any other index". So you are saying you NEVER search this table on ANYTHING other than left-to-right-column order of the clustered primary key??

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • INNOCENT GUMBO

    You will notice in my post about multiple clustered indexes, I put an upside-down smiley at the end. I meant my comment/question as tongue-in-cheek. 😛

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • SQL Noob (11/12/2009)


    bob 54859 (11/11/2009)


    Ok,

    So... here is a question we have debated in our shop for quite a while.

    The question is...

    Given a table where they is no good candidate for a clustered index, for example we use client generated GUIDs for primary keys and most of the data does not meet the for criteria you outline above... Which approach should we take:

    1. Keep the table as a heap.

    2. Create an identity field on the table which is used for no other reason that to create a clustered index upon it.

    I believe Kim Tripp recommended number 2 in a podcast iirc. however, we have a dev here who is adamant that this is a mistake.

    I'm am scared that the answer will be "it depends".

    HELP!

    Thanks,

    BOb

    we do #2 all the time. sometimes we'll get a table structure from the devs with no PK. we'll kick it back and ask where we can create a PK or just an indentity column to serve a PK? we have a lot of tables like this and don't have any problems.

    So, the dev in our shop that is against this claimed that when he did this a select query was 10x slower and SQL chose the wrong index. I didn't see how that was possible but he claimed this was the reason.

    Perhaps if you add an identity and cluster on it you have to rebuild the indexes? Or will SQL rebuild the indexes automatically?

    (yes I asked for the db and query in question and it is not forth comming)

    Thanks,

    BOb

  • bob 54859 (11/12/2009)


    So, the dev in our shop that is against this claimed that when he did this a select query was 10x slower and SQL chose the wrong index. I didn't see how that was possible but he claimed this was the reason.

    Get details, including execution plans before ans after

    Perhaps if you add an identity and cluster on it you have to rebuild the indexes? Or will SQL rebuild the indexes automatically?

    See the very first reply I made in this thread.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • TheSQLGuru

    I can see that. Sorry about that

  • GilaMonster (11/12/2009)


    bob 54859 (11/12/2009)


    So, the dev in our shop that is against this claimed that when he did this a select query was 10x slower and SQL chose the wrong index. I didn't see how that was possible but he claimed this was the reason.

    Get details, including execution plans before ans after

    Perhaps if you add an identity and cluster on it you have to rebuild the indexes? Or will SQL rebuild the indexes automatically?

    See the very first reply I made in this thread.

    Yes, right of course. And that only makes sense and is what you would expect. It seems that heap vs clustered index perf isn't major... when I get a change I will do as you suggest and test it. Of course, I would have to test every query we make against the tables.

    Of course, dev manager vs architect... you know who wins. That's ok, I have bigger fish to fry. 😉

    BOb

  • one time we got a script to create a table from development. it had a PK but no indexes. we asked if we should add any indexes to the new table. 30 minutes later we got a script that added a non-clustered index with every or almost every column from the table.

  • bob 54859 (11/12/2009)


    SQL Noob (11/12/2009)


    bob 54859 (11/11/2009)


    Ok,

    So... here is a question we have debated in our shop for quite a while.

    The question is...

    Given a table where they is no good candidate for a clustered index, for example we use client generated GUIDs for primary keys and most of the data does not meet the for criteria you outline above... Which approach should we take:

    1. Keep the table as a heap.

    2. Create an identity field on the table which is used for no other reason that to create a clustered index upon it.

    I believe Kim Tripp recommended number 2 in a podcast iirc. however, we have a dev here who is adamant that this is a mistake.

    I'm am scared that the answer will be "it depends".

    HELP!

    Thanks,

    BOb

    we do #2 all the time. sometimes we'll get a table structure from the devs with no PK. we'll kick it back and ask where we can create a PK or just an indentity column to serve a PK? we have a lot of tables like this and don't have any problems.

    So, the dev in our shop that is against this claimed that when he did this a select query was 10x slower and SQL chose the wrong index. I didn't see how that was possible but he claimed this was the reason.

    Perhaps if you add an identity and cluster on it you have to rebuild the indexes? Or will SQL rebuild the indexes automatically?

    (yes I asked for the db and query in question and it is not forth comming)

    Thanks,

    BOb

    i've had instances where i would create a dummy bigint column just for a PK and set it up as non-clustered because i needed a clustered index on another column. other times i'll take the default and set it up as a clustered PK

  • Thanks for a great article Gail. I have a question relating to this area for my first forum post, hopefully it is not too system specific. Please bear with me while I try to explain the situation.

    I have seen issues at 2 client sites running SQL 2000 where several tables in the db are showing large areas of unused space (we are talking 1 - 3Gb per table here) but will not release this space when shrinking the db. I recently read an article I cannot find now that claimed this could happen when all the indexes on a table were heaps.

    The tables in question are part of an OLTP application and these tables indeed do not have a clustered index. The article claimed creating a clustered index and then dropping it would reorganise the data such that a subsequent shrinkfile would be able to release the space. I tried this and it worked!

    My questions are:

    Have you, on anyone on the forum, seen this issue?

    Is it SQL 2000 specific?

    If the table(s) in question are of the type that have order number and possibly order line values, is leaving a clustered index on the table likely to cause issues?

    Will leaving a clustered index on the table solve the excess space problem noted above?

    Again apologies if this is far too complex a query for the forum but I thought it worth asking!

    Thanks,

    Mark.

  • I've often come across indexes that have been created in ignorance e.g. Col1, Col2, Col3 where the access (queries and/or reporting etc) were always select or join on Col1, ...Col2, ...(Col1 AND Col2) or ...(Col1 AND Col3).

    These are often better defined as compound indexes (Col1 + Col2).

    Presumably SS like most all other DBMS's makes use of a compound key index to search a table when a query is searching on columns that form the major part of a compound key so in the example above, it would be better to have indexes as [Col1 + Col2], [Col1 + Col3] and Col2. There's no benefit in index Col3 unless it's being used directly or is used in an order by as the primary sort key.

    Jim

    Trainee Novice:w00t:

  • mark.stringer (11/12/2009)


    I recently read an article I cannot find now that claimed this could happen when all the indexes on a table were heaps.

    If the table is a heap. Indexes cannot be heaps, heap is the term for a table without a clustered indexe

    Have you, on anyone on the forum, seen this issue?

    No, because I don't use heaps. I know about it though.

    Is it SQL 2000 specific?

    No

    If the table(s) in question are of the type that have order number and possibly order line values, is leaving a clustered index on the table likely to cause issues?

    Not enough information. Maybe start a new thread in an appropriate forum and ask this?

    Will leaving a clustered index on the table solve the excess space problem noted above?

    Probably, but that's not to say it won't result in other space-related 'problems'

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Esalter (11/12/2009)


    Presumably SS like most all other DBMS's makes use of a compound key index to search a table when a query is searching on columns that form the major part of a compound key

    Left-based subset of the index key. Discussed in much more detail in part 3 (coming next week)

    so in the example above, it would be better to have indexes as [Col1 + Col2], [Col1 + Col3] and Col2.

    If filters were Col1 or Col2 or Col1 and Col2 or col1 and Col3, I'd probably just create two indexes:

    Col2, Col1

    Col1, Col3

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 46 through 60 (of 122 total)

You must be logged in to reply to this topic. Login to reply