Advice on creating unique clustered index

  • I've been studying the possibility of indexing a particular view that takes 26 minutes to run, trying to optimize it.  I'm experimenting with a test version of the view.  After reading a lot of Microsoft's documentation, there is one thing I'm unsure of.  Should I include just 1 column, or multiple columns in the CREATE UNIQUE CLUSTERED INDEX statement?  What's the difference?
    I realize this is probably a knowledge gap of mine in general on creating indexes, but I'm having trouble finding a basic explanation for this.

  • A clustered index actually defines data storage. You can have one or more columns as the key for the clustered index, but you won't get any INCLUDE columns in there because it stores the data. It's different than a non-clustered index which can have INCLUDE columns at the leaf level (making it sort of act like a clustered index).

    If you're asking if you should have two columns to make the clustered index unique... that's hard to answer without seeing the query and the structure in question. All clustered indexes are unique, either through you providing unique column(s) or by the addition of a uniquifier value (essentially an IDENTITY column providing an assurance of uniqueness).

    Generally I suggest making the clustered key the most used access path to the data (since it stores the data). If that's one column or three, it's usually the best way to go, even if they're not unique. However, the most efficient clustered index is a monotonically increasing, integer value that is unique (putting that in there because someone is going to get worked about what is the most efficient). The reason I go with the most used access path to the data versus the most efficient clustered index is because it doesn't matter how efficient the index is if it isn't used regularly. You only get one per table (with the possibility of creating a materialized view as you seem to be exploring) so you may as well make it an index that sees a lot of use.

    If I'm off the mark here, let me know.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • "All clustered indexes are unique" is not a true statement. A clustered index is any index you create as clustered, and the table is stored, at least logically, in the order of the clustered index. Unique indexes or primary keys are unique, that is they constrain the table not allowing duplicate values appearing in the index columns.
  • Joe Torre - Monday, May 15, 2017 6:02 PM

    "All clustered indexes are unique" is not a true statement. A clustered index is any index you create as clustered, and the table is stored, at least logically, in the order of the clustered index. Unique indexes or primary keys are unique, that is they constrain the table not allowing duplicate values appearing in the index columns.

    All clustered indexes are unique. That is a true statement. While you don't have to create a clustered index using the UNIQUE key word to establish that property, internally SQL Server makes that clustered index unique by the addition of a uniquifier. You can even see it in action if you use DBCC PAGE. The clustered index is unique because the data is stored at the leaf level of the clustered index and there must be a way to uniquely identify each and every row. Another aspect of the need for uniquely identifying the row comes from the fact that the nonclustered indexes refer to the clustered index key for finding the data (key lookup).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Personally, I would never say all clustered indexes are unique.  It is a philosophical issue but I believe Grant's statement will mislead all beginners.  Technically everything in the universe is unique.  Even non-clustered indexes represent every row in a table and therefore are unique.  Sorry Grant, no offense, but sometimes we'll disagree.

  • But if that's the case Bill then Filtered Indexes are a breach in the philosophical argument. Maybe even the space time continuum...

  • Bill Talada - Tuesday, May 16, 2017 5:42 AM

    Personally, I would never say all clustered indexes are unique.  It is a philosophical issue but I believe Grant's statement will mislead all beginners.  Technically everything in the universe is unique.  Even non-clustered indexes represent every row in a table and therefore are unique.  Sorry Grant, no offense, but sometimes we'll disagree.

    No worries. People honestly disagree all the time.

    I'll still push back a little on this one. Clustered indexes are special (one might even say unique) in that they define the storage of the data, not simply a set of keys. And yeah, technically nonclustered indexes are unique at the leaf level, but only by virtue of the fact that they have the unique key values of the clustered index stored with them (or the RID for heaps). Before writing any of this, I checked Kalen Delaney's book to validate it, and she emphasizes the unique nature of clustered indexes as well (yeah, arguing from authority sucks, but I know I screw up a lot, so I validate where I can).

    You do make a valid point though.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I would question the need to make the indexed view unique.  Is a regular indexed view not enough?

    My biggest concern is when you attempt to insert a record into a table used in the view and the insert is rolled back because it causes a duplicate record in the view (but does not violate a constraint in the table itself).

    If you must have a unique constraint, I'd recommend you enforce it at the table level.  If the indexed view must return unique data, use SELECT DISTINCT.   I don't think enforcing a UNIQUE constraint on an indexed view is a good idea.

    Wes
    (A solid design is always preferable to a creative workaround)

  • Grant Fritchey - Tuesday, May 16, 2017 5:57 AM

    Bill Talada - Tuesday, May 16, 2017 5:42 AM

    Personally, I would never say all clustered indexes are unique.  It is a philosophical issue but I believe Grant's statement will mislead all beginners.  Technically everything in the universe is unique.  Even non-clustered indexes represent every row in a table and therefore are unique.  Sorry Grant, no offense, but sometimes we'll disagree.

    No worries. People honestly disagree all the time.

    I'll still push back a little on this one. Clustered indexes are special (one might even say unique) in that they define the storage of the data, not simply a set of keys. And yeah, technically nonclustered indexes are unique at the leaf level, but only by virtue of the fact that they have the unique key values of the clustered index stored with them (or the RID for heaps). Before writing any of this, I checked Kalen Delaney's book to validate it, and she emphasizes the unique nature of clustered indexes as well (yeah, arguing from authority sucks, but I know I screw up a lot, so I validate where I can).

    You do make a valid point though.

    In Microsoft SQL Server 2012 Internals, you'll find the argument Grant's referring to on page 316.  Either make it unique or SQL Server will do it for you.  Personally, I think it's better to make it unique yourself and not have SQL consume extra bytes on the row for you.

  • If you'll (almost) always look up by certain key value(s), and those keys are inherently terrible for clustering (such as a guid), then cluster on all of them.  Yes, in theory that makes any non-clus indexes wider, but (1) you won't come in by other key values that often and (2) you might have had to include one or more of those keys in that nc index anyway,.

    As to uniqueness, the potential issue with leaving it non-unique is that it seems to lead to ghost rows in the table (which for some reason MS docs insist on calling ghost "records").

    When the clus key is not unique, SQL has to add a "counter" to the key to make it unique.  But SQL needs a starting point for that counter.  While rows exist, SQL can use the last row to "count" from.  But when you delete rows, SQL would lose the "counter" if it allowed all rows to be deleted.  So I suspect SQL must keep a ghost row(s) to provide that counter(s).  I believe SQL can't simply start back at 1, or some lower value, because of potential rollbacks (or forward recovery?) of the db.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply