Indexing VARCHAR. Does Length Matter ?

  • We have some TXT file imports into SQL tables. As a default, the columns are set to varchar(500) to match the incoming definition. However, the data will never be 500.

    Is there a penalty to indexing a VARCHAR(500) column, if the max length will be 50, compared to defining a VARCHAR(50) column and migrating the data to that, and indexing the VARCHAR(50) column ?

    • This topic was modified 1 year, 10 months ago by  homebrew01.
    • This topic was modified 1 year, 10 months ago by  homebrew01.
  • Yes, there is a penalty, because SQL bases its memory allocation on the column lengths.  If you think about it, that actually makes perfect sense, as presumably longer columns would tend to use more memory.

    SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: "If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them."

  • Also, there is a 900 byte max key length that you have to stay within.

    The simple fact is, it's going to be harder to index 500 characters than 50 because it's longer. Yes, 420 characters will be more selective than 42, but, your histogram is going to have to cover a lot more territory accurately to make the index reflect reality. Data skew is certainly likely with much larger text fields which can lead to bad plans as the row count estimates are wrong. As narrow as possible on the keys for indexes is a good general rule. It's not carved in a stone tablet, but I do try to follow it. However, if you need to index 500 characters, and testing shows it helps, index 500 characters.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
    Theodore Roosevelt

    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • What I'm wondering is if the data is only 50 characters, the remaining 450 characters will be blank. So is an index on basically the first 50 characters of a 500 field worse than an index on a 50 character field ?

    • This reply was modified 1 year, 10 months ago by  homebrew01.
  • Well, then it won't matter. If the data will ALWAYS be less than 50 characters then indexing a VARCHAR(50) or a VARCHAR(500) will be the same. It's completely dependent on the data. However, you can't guarantee that the data will be less than 50 because you're allowing 500.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
    Theodore Roosevelt

    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • Thanks for the replies.

  • "Also, there is a 900 byte max key length that you have to stay within"

    In SQL Server 2016 this has been increased to 1700 bytes for non-clustered indexes, see https://blogs.msdn.microsoft.com/sqlserverstorageengine/2016/04/26/increased-nonclustered-index-key-size-with-sql-server-2016/

    • This reply was modified 1 year, 10 months ago by  William Rayer.
    • This reply was modified 1 year, 10 months ago by  William Rayer.

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply