Understanding of Unique Index

  • I appreciate it might be a trifling question for most of you, but...

    I have a primary key on columns A, B and C (three columns in total). Due to performance issue when joined to another table, I have consulted our DBA's how to improve the performance
    and was advised to create a non clustered index on columns A,B,C,D,E and F (six columns in total, including three from the primary key). I have asked why not to have this as a
    index unique (in ADDITION to the primary key, NOT as a replacement), as a combination of columns A,B,C will always be unique anyhow, hence adding more columns will not affect the uniqueness of that record in any way. 
    Their reply was as below and I am not sure I understand it - if I have, for example, a unique record for vehicle type = 1, vehicle code = 200, material used = 300 how come adding 
    date of use and person working on that vehicle columns will affect its uniqueness - there is only one record for vehicle type = 1, vehicle code = 200 and material used = 300?

    Much appreciated! 

    /***** Quote ********/
    If we create a unique index of all the required columns i.e. A,B, C, D, E and F.
    The index will not allow any duplicate values for columns D, E and F. If anyone true to insert value in D, E and F columns
    and if it is duplicate it will throw an error due to unique index.

    If we create a non cluster index with all the required columns the query will find all the required column
    on one index which will reduce the cost and since A,B and C are primary key columns they will not allow any duplicate value,
    but for remaining columns duplicate values are allowed.

    Unique Indexes are mostly used for columns which are not primary key column but still you dont want any duplicate values in that columns.
    so, if try to insert duplicate value in that column it will not allow you to do so due to unique index.

  • Maybe the DBA's didn't quite understand what you were asking?  You are correct however if you already knowing certain columns will be unique adding more columns to that combination will always still be unique.  So from that perspective there is no reason to add a unique constraint on the index, however SQL Server will build query plans differently if it knows the index is unique, maybe the DBA's didn't realize that or they thought you wanted A B C to be unique but also D E F to be unique separately?

  • Piling on a little. Yes, if the first three columns make the data unique, then you can duplicate the data in DEF as long as ABC is different each time. The DBAs are off on this one.

    Bigger question, If you have a primary key on ABC and an index on ABCDEF (unique is better if possible), then you may not always see that second index being used by the query optimizer (depending on the queries in question). The reason is, one of the determining factors for which index to use is the histogram. The histogram is built on the first column of an index only. You have to indexes with identical histograms (potentially) since they're built off the same column. Further, because ABC is smaller, it's also going to be more attractive for some operations to the optimizer. I'd validate, by looking at the execution plan, that this index is getting used before you add it to the table (there is overhead with adding indexes).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey - Monday, November 6, 2017 4:12 PM

    Piling on a little. Yes, if the first three columns make the data unique, then you can duplicate the data in DEF as long as ABC is different each time. The DBAs are off on this one.

    Bigger question, If you have a primary key on ABC and an index on ABCDEF (unique is better if possible), then you may not always see that second index being used by the query optimizer (depending on the queries in question). The reason is, one of the determining factors for which index to use is the histogram. The histogram is built on the first column of an index only. You have to indexes with identical histograms (potentially) since they're built off the same column. Further, because ABC is smaller, it's also going to be more attractive for some operations to the optimizer. I'd validate, by looking at the execution plan, that this index is getting used before you add it to the table (there is overhead with adding indexes).

    Yes, I think this was the reason DBA's came with the suggestion to have that index with 6 columns, in addition to the existing primary key, from the first place.

    May I ask why to have that additional index and not, for example, to drop the primary key and to create only one unique index with columns A,B and C and to INCLUDE() columns D, E and F, please?
    Wouldn't that approach to improve the performance as well, instead of having both primary key and that new index?

  • BOR15K - Tuesday, November 7, 2017 2:56 AM

    Grant Fritchey - Monday, November 6, 2017 4:12 PM

    Piling on a little. Yes, if the first three columns make the data unique, then you can duplicate the data in DEF as long as ABC is different each time. The DBAs are off on this one.

    Bigger question, If you have a primary key on ABC and an index on ABCDEF (unique is better if possible), then you may not always see that second index being used by the query optimizer (depending on the queries in question). The reason is, one of the determining factors for which index to use is the histogram. The histogram is built on the first column of an index only. You have to indexes with identical histograms (potentially) since they're built off the same column. Further, because ABC is smaller, it's also going to be more attractive for some operations to the optimizer. I'd validate, by looking at the execution plan, that this index is getting used before you add it to the table (there is overhead with adding indexes).

    Yes, I think this was the reason DBA's came with the suggestion to have that index with 6 columns, in addition to the existing primary key, from the first place.

    May I ask why to have that additional index and not, for example, to drop the primary key and to create only one unique index with columns A,B and C and to INCLUDE() columns D, E and F, please?
    Wouldn't that approach to improve the performance as well, instead of having both primary key and that new index?

    More information about the table would be useful at this point. Is the PK clustered or not, and if not, what's the table clustered on? How many columns / how wide is the table?
    Probably best answered by posting the DDL for the table along with indexes.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • BOR15K - Tuesday, November 7, 2017 2:56 AM

    Yes, I think this was the reason DBA's came with the suggestion to have that index with 6 columns, in addition to the existing primary key, from the first place.

    May I ask why to have that additional index and not, for example, to drop the primary key and to create only one unique index with columns A,B and C and to INCLUDE() columns D, E and F, please?
    Wouldn't that approach to improve the performance as well, instead of having both primary key and that new index?

    I can only answer that question with speculation. Situationally, I could conceive of doing what you say. Situationally, I could conceive why that would be a poor choice. Not seeing your data structures, the data distribution, other indexes on the table, and, most importantly, the queries in question, I can't give you an intelligent answer here. Best thing I can say is, it might be worth investigating.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey - Tuesday, November 7, 2017 5:10 AM

    I can only answer that question with speculation. Situationally, I could conceive of doing what you say. Situationally, I could conceive why that would be a poor choice. Not seeing your data structures, the data distribution, other indexes on the table, and, most importantly, the queries in question, I can't give you an intelligent answer here. Best thing I can say is, it might be worth investigating.

    You are right, of course. I will investigate myself first.
    Thank you

  • Yeah, your "DBAs" are dead wrong, and don't seem to understand indexing at even a basic level, which is a tough situation for you (and all the dbs they control!).

    There could in theory be good reason(s) for building an additional index like that, but it seems extraordinarily unlikely, given that A,B,C is unique anyway.

    Much more likely is a need for a nonclus index with keys of ( A, B, C ) and included columns of ( D, E, F ).  That's especially common if the table has a wide row, that is, the row has a (very) large number of bytes.  That, too, is quite common, often caused by a lack of design / normalization.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • BOR15K - Tuesday, November 7, 2017 2:56 AM

    May I ask why to have that additional index and not, for example, to drop the primary key and to create only one unique index with columns A,B and C and to INCLUDE() columns D, E and F, please?
    Wouldn't that approach to improve the performance as well, instead of having both primary key and that new index?

    If the primary key is the clustered index, then all of the columns are "included" since the clustered index stores the whole table.  The only advantage to having a separate index with just these 6 columns in it (all 6 the key or 3 key and 3 include) is if the query only references columns within that set of 6, then the optimizer thinks it's more efficient to do less reads against the new index than against the clustered index which also has more columns G, H, etc.  It doesn't take into consideration though that the pages it needs of the clustered index may already be in memory so it wouldn't need to read from disk, and the fact that you would then have 2 copies of essentially the same data in memory (once for each index).  I've seen people do things like this because they saw it improve performance on one query, without considering the effects on the entire workload, and I've had to try to talk them out of it.

    If the primary key is not the clustered index, you still can't create a primary key with include columns.  You could as you suggested remove both and only use a single unique index with include columns, since you can tie a foreign key from another table to the unique index.

    Either way as others have already mentioned, if the combination of A,B,C is unique, adding more columns does NOT make it less unique, it is still unique.

  • As already identified by others, I think what you need is a covering index which could be the current unique index and includes the fields that are required by the query.  Be mindful though that this does come at a cost, effectively the data in covered fields is duplicated and therefore your storage costs will increase and any changes in these fields will require that the index is modified with an additional write cost.  Your DBAs may be paranoid about altering the existing key as it will likely have to be dropped and re-created to add the fields. otherwise updating it in-situ could cause different performance issues as the pages fragment if there is no space for the additional fields.

    Adding more indexes may increase your performance but if the statistics are off then other queries MIGHT try to use the index when it is not appropriate (the query optimiser is not infallible) and additional indexes will ALWAYS slow down writes.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply