Workaround maximum length for a nonclustered index in SQL Server

  • astrid 69000

    Hall of Fame

    Points: 3145

    Hi, hope everyone is well.

    I need to create a unique nonclustered index, but the key length is 1700 bytes.

    I know the workaround is to create the index with INCLUDE columns.

    My question is: would that index still be unique? Will an insert fail if it breaks the logic of the unique nonclustered index with INCLUDE?

     

    Thanks

    Astrid

  • Lynn Pettis

    SSC Guru

    Points: 442285

    The maximum length for a nonclustered index key in SQL Server 2017 is 1700 bytes. Since you have not provided enough information, it is hard to say what your problem may be. Perhaps if you provided the full text of the error message and/or the DDL for the table and the index you are trying to create, we could give a better answer.

    Edit: This limit applies to the actual indexed columns. You can add additional columns that are not part of the index key as included columns.
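
    As a hedged illustration (hypothetical table and column names, not from the original post):

    ```sql
    -- Only KeyCol counts toward the 1700-byte nonclustered key limit;
    -- WideCol is stored only at the leaf level and adds no key bytes.
    CREATE UNIQUE NONCLUSTERED INDEX IX_Demo_KeyCol
        ON dbo.Demo (KeyCol)
        INCLUDE (WideCol);
    ```

    Note that UNIQUE applies to the key columns only - included columns play no part in the uniqueness check.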

     

     

  • Jeffrey Williams

    SSC Guru

    Points: 88347

    astrid 69000 wrote:

    My question is: would that index still be unique? Will an insert fail if it breaks the logic of the unique nonclustered index with INCLUDE?

    No - included columns are not part of the index key or the unique constraint.  I am curious how you ended up with a table where the only way to identify a unique row is 1700 bytes long.  How many columns are part of this index - and what are the data types of each column?

    What does this table represent?

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • astrid 69000

    Hall of Fame

    Points: 3145

    Thanks for the reply. This is the table I need to create - we follow the Data Vault methodology.

    I am able to build the table but I get a warning:

    Warning! The maximum key length for a nonclustered index is 1700 bytes. The index 'UC_HubHistoryDataNew' has maximum length of 1802 bytes. For some combination of large values, the insert/update operation will fail.

    This is the table code.

    CREATE TABLE [Raw].[HubHistoryDataNew](
        [HistoryDataHashKey] [char](40) NOT NULL,
        [LoadDate] [datetime] NULL,
        [RecordSource] [nvarchar](200) NULL,
        [LastSeenDate] [datetime] NULL,
        [CreateDate] [datetime2](7) NULL,
        [UserID] [nvarchar](20) NULL,
        [TableName] [nvarchar](50) NULL,
        [ColumnName] [nvarchar](50) NULL,
        [PrimaryKey] [nvarchar](254) NULL,
        [Action] [nvarchar](15) NULL,
        [OldValue] [nvarchar](254) NULL,
        [NewValue] [nvarchar](254) NULL,
        CONSTRAINT [HubHistoryDataNew_PK] PRIMARY KEY NONCLUSTERED
        (
            [HistoryDataHashKey] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [INDEX],
        CONSTRAINT [UC_HubHistoryDataNew] UNIQUE NONCLUSTERED
        (
            [CreateDate] ASC,
            [UserID] ASC,
            [TableName] ASC,
            [ColumnName] ASC,
            [PrimaryKey] ASC,
            [Action] ASC,
            [OldValue] ASC,
            [NewValue] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [INDEX]
    ) ON [DATA]
    GO
  • frederico_fonseca

    SSChampion

    Points: 14475

    If you are following Data Vault design, then you have too many columns on that hub. Most of the columns there look like attributes of your business key and are not, as far as I can tell, a business key in their own right.

     

    Also, if you can, don't define the columns as nvarchar - unless your data is Unicode, stick with varchar types, as it will save space.

    Edit: and looking at the columns, this looks like it should be a link table (split into link and satellite) rather than a hub.

  • astrid 69000

    Hall of Fame

    Points: 3145

    This is a special table; the columns are the components of that primary hash key. That is why they are all there.

    I can look into switching from nvarchar to varchar.

    Thanks

  • Jeff Moden

    SSC Guru

    Points: 996046

    Speaking of hashes, you could create a hash column for all the columns and make that the Clustered PK.  That would certainly be within the limits of a CI.

     

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • astrid 69000

    Hall of Fame

    Points: 3145

    I did not choose the type of table; I don't even know what the table is for. I was just asked to create a hub.

  • astrid 69000

    Hall of Fame

    Points: 3145

    Jeff Moden wrote:

    Speaking of hashes, you could create a hash column for all the columns and make that the Clustered PK.  That would certainly be within the limits of a CI.

    What do you mean? We have already hashed all the columns to create a hash.


  • Jeff Moden

    SSC Guru

    Points: 996046

    astrid 69000 wrote:

    Speaking of hashes, you could create a hash column for all the columns and make that the Clustered PK.  That would certainly be within the limits of a CI.

    What do you mean? We have already hashed all the columns to create a hash.

    Ah... I see.  I missed the fact that you posted the code for the table.

    There's no way to make an index with 1700 bytes of keys without the possibility of future failure, period, because everyone forgets about the row headers within the indexes.  Even on 2017, it's likely to come up a bit short because of that (although I've not tested it specifically, because a 1700-byte keyed index isn't practical, IMHO).  You'll need to identify the minimum set of columns that makes the index unique and use INCLUDE for the rest if you want to come even close.  Certainly the old and new value columns don't need to be part of the keys.

    The real key here (no pun intended) is that this is a column-based AUDIT/HISTORY table and, ostensibly, it's being fed only by triggers (if you're using CLR triggers, that's a HUGE performance mistake... obviously, I've been there and done that, and it needs to be fixed if you have those or anything dynamic within your triggers; there's a way around that).  That being said, you really shouldn't need a UNIQUE index on all the columns you're trying to use as keys.  At best, you only need TableName, ColumnName, PrimaryKey, and CreateDate for keys and, with the understanding that it's a huge and wasteful duplication of data, use INCLUDE for the other columns.
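
    A sketch of that suggestion against the posted table (it assumes those four columns really do identify a row - uniqueness is enforced on the key columns only, so rows differing solely in the INCLUDEd columns would be rejected):

    ```sql
    -- Narrow key (roughly 716 bytes at most) plus leaf-level copies of the rest.
    CREATE UNIQUE NONCLUSTERED INDEX UC_HubHistoryDataNew
        ON [Raw].[HubHistoryDataNew] (TableName, ColumnName, [PrimaryKey], CreateDate)
        INCLUDE (UserID, [Action], OldValue, NewValue)
        WITH (FILLFACTOR = 80) ON [INDEX];
    ```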

    Since we also have this form of auditing at work, I can tell you that your trigger is also being bogged down by converting old and new values to NVARCHAR.  This is one of the very few places in the world where the SQL_Variant data type works a treat (there are some minor caveats, but it works perfectly for everything I need such an audit table for).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Lynn Pettis

    SSC Guru

    Points: 442285

    Looking at the table you are creating, I have to agree: this is not a hub table in the Data Vault methodology. A hub table will have the natural business key; if using Data Vault 2.0 it will also have an MD5 hash of the business key, which will be used as the primary key in the hub, satellite, and link tables; plus the source system, the date first seen, and the date deleted or last seen.  It will have no other data.  All other data will be in one or more satellite tables and possibly in link tables.

    Also, using the Data Vault methodology there would be no reason to have columns holding old and new values.  The Data Vault methodology is insert-only into the raw data vault, even when a physical delete of data occurs in any of the source systems.

    If you need to ensure uniqueness across all the columns you have listed as index columns, you can use CHECKSUM, BINARY_CHECKSUM, or HASHBYTES across all those columns, store the result as a persisted computed column, and index on that column.  Just remember that you have to consider the data, especially the character data.  For example, HASHBYTES treats character data as case sensitive when it hashes.  This means "SEPTEMBER", "September", and "september" will generate different hashes.
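
    A sketch of that persisted-hash approach against the posted table (the column and index names here are invented for illustration; drop the UPPER() if you want the hash to remain case sensitive):

    ```sql
    -- Deterministic computed column over the candidate-key columns.
    -- CONCAT treats NULLs as empty strings; '|' separates the values.
    ALTER TABLE [Raw].[HubHistoryDataNew] ADD HistoryRowHash AS
        CAST(HASHBYTES('SHA2_256', UPPER(CONCAT(
            CONVERT(varchar(27), CreateDate, 121), N'|', UserID, N'|',
            TableName, N'|', ColumnName, N'|', [PrimaryKey], N'|',
            [Action], N'|', OldValue, N'|', NewValue))) AS binary(32))
        PERSISTED;

    -- A 32-byte key is far below the 1700-byte limit.
    CREATE UNIQUE NONCLUSTERED INDEX IX_HubHistoryDataNew_RowHash
        ON [Raw].[HubHistoryDataNew] (HistoryRowHash) ON [INDEX];
    ```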

     

  • frederico_fonseca

    SSChampion

    Points: 14475

    astrid 69000 wrote:

    i did not choose the type of table it is, I don't even know what the table is for. I was just asked to create a hub

    Then go back to whoever asked you to create it and get the full and correct definition. Under the Data Vault methodology you do need to follow the rules - otherwise you are just creating a mess.

    Without doing a full and proper analysis, and based just on the table definition you supplied, the following would be a possible representation of the data under the DV method.

    And I'm still not happy with the link representation - and TableEntry most likely could also be split into further tables (a hub for table, a hub for column, a link between them - potentially another hub for PrimaryKey and yet another link to join them all together).

    Hub - User
    UserHash
    BusinessKey
    -- UserID

    Hub - TableEntry
    TableEntryHash
    BusinessKey
    --TableName
    --ColumnName
    --PrimaryKey

    Link - TableEntryChanges
    LinkHash
    UserHash
    TableEntryHash

    Link Sat
    LinkHash
    LinkAttributes
    --CreateDate
    --Action
    --OldValue
    --NewValue
  • frederico_fonseca

    SSChampion

    Points: 14475

    And on a different note - there is no need for that extra unique index. If the hash is done correctly and is based on those columns, then it is already unique by definition, as you set it as the primary key.

    And when accessing the table to see if a record already exists, it should ALWAYS be done with the hash, never the individual columns, so that index would not really be that useful.

  • scdecade

    Say Hey Kid

    Points: 710

    The other constraint on the size of unique keys is the number of columns, which I believe is 16.  A bigint is only 8 bytes, so to the extent all of the OP's nvarchar columns could be substituted by surrogate keys, it's nowhere near the limit of what SQL Server can handle.
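
    A hypothetical sketch of that substitution (table and column names invented for illustration):

    ```sql
    -- Each wide nvarchar value moves into a lookup table...
    CREATE TABLE dbo.TableNames (
        TableNameID bigint IDENTITY(1,1) PRIMARY KEY,
        TableName   nvarchar(50) NOT NULL UNIQUE
    );

    CREATE TABLE dbo.ColumnNames (
        ColumnNameID bigint IDENTITY(1,1) PRIMARY KEY,
        ColumnName   nvarchar(50) NOT NULL UNIQUE
    );

    -- ...and the history table references the 8-byte surrogates instead, so
    -- a unique index over (CreateDate, UserID, TableNameID, ColumnNameID, ...)
    -- stays far below the 1700-byte key limit.
    ```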

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
