Index Tuning Week: How Many Indexes Are Too Many?

This week I’m running a series of posts introducing the fundamentals of index tuning. First up, let’s talk about understanding how many indexes make sense on a table.

The more nonclustered indexes you have on a table,
the slower your inserts and deletes will go.

It’s that simple.

If you have a table with 10 nonclustered indexes, that’s 10x (or more) the writes an insert has to do, across all of the data pages involved with each index. (The data page writes happen later, asynchronously, during a checkpoint, but that’s still more work for storage, and there’s still synchronous blocking involved.)
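
If you want to watch that write amplification happen, here’s a minimal sketch against a hypothetical dbo.Users table: insert a single row, then check the leaf-level counters, and every nonclustered index ticks up.

-- Hypothetical example: dbo.Users is a placeholder table name.
-- After inserting one row, leaf_insert_count climbs for every index.
SELECT i.name AS index_name,
       ios.leaf_insert_count,
       ios.leaf_delete_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.Users'), NULL, NULL) AS ios
INNER JOIN sys.indexes AS i
    ON i.object_id = ios.object_id
   AND i.index_id = ios.index_id
ORDER BY i.index_id;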

I’ve seen so many DBAs who scream bloody murder about 15-second IO warnings on their storage, and then when we go to look at their tables, they have dozens of indexes on every table. We remove the unneeded indexes, and presto: their storage workload drops. Their storage does less work, which means it can react to other requests faster, and suddenly they have a cascading performance improvement throughout their queries.

People weren’t complaining about the speed of inserts and deletes – but those background operations were killing storage with a thousand small cuts.
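
One rough way to put numbers on those cuts: watch write stalls per database file. This is just a sketch (the counters accumulate from startup, so compare snapshots before and after dropping indexes):

-- Sketch: average write stall per file. If dropping unneeded
-- indexes helps, these numbers should grow more slowly afterward.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_writes,
       vfs.io_stall_write_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
INNER JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id
   AND mf.file_id = vfs.file_id
ORDER BY vfs.io_stall_write_ms DESC;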

To help performance tuners, I came up with Brent’s 5 and 5 Rule: aim for around 5 indexes per table, with around 5 columns (or fewer) on each.

This is not set in stone. It’s simply based on the fact that I have 5 fingers on one hand, and 5 fingers on my other hand, so it’s really easy to remember.

Brent’s 5 and 5 Rule

5 and 5 is just a starting point for discussion.
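
If you’re curious how an existing database stacks up against it, here’s a rough sketch that flags tables blowing past the first 5. It counts every column in each index, keys and includes alike, since both count toward the second 5:

-- Sketch: tables with more than 5 nonclustered indexes, plus the
-- column count of their widest index (key + included columns).
SELECT OBJECT_SCHEMA_NAME(i.object_id) + '.' + OBJECT_NAME(i.object_id) AS table_name,
       COUNT(*) AS nonclustered_indexes,
       MAX(cols.column_count) AS widest_index_columns
FROM sys.indexes AS i
CROSS APPLY (SELECT COUNT(*) AS column_count
             FROM sys.index_columns AS ic
             WHERE ic.object_id = i.object_id
               AND ic.index_id = i.index_id) AS cols
WHERE i.type_desc = 'NONCLUSTERED'
  AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
GROUP BY i.object_id
HAVING COUNT(*) > 5
ORDER BY nonclustered_indexes DESC;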

Sometimes, 5 indexes aren’t enough. Sometimes your users want to filter or sort by lots of different fields on a really wide table. However, the more indexes you add, the slower your inserts and deletes will go, and the more competition pages will have for precious memory space. You may have to throw hardware at it in the form of more memory and faster storage.

Sometimes, even just 5 indexes are too many. When you have a table where insert and delete speeds are absolutely critical, and select speeds don’t matter, then you can increase performance by cutting down on your indexes.

Sometimes, 5 columns are too many. If you’re indexing big VARCHAR(MAX) columns, or you’re indexing hot columns (columns that constantly change), you can still run into problems. And yes, included columns count against your total – if that surprises you, watch my free How to Think Like the Engine class to learn how pages are built.
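
For instance, in this made-up index, the two INCLUDE columns still get copied onto every leaf page, so it spends four of its five columns even though the key only has two:

-- Made-up example: 2 key columns + 2 included columns = 4 total.
CREATE INDEX IX_Users_LastAccessDate_Location
    ON dbo.Users (LastAccessDate, Location)
    INCLUDE (DisplayName, Reputation);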

Sometimes, 5 columns aren’t enough. Maybe you’ve got a query that has 3 fields in the filter, the combination of which is extremely selective, and it only needs to return 4 fields in the SELECT. That’d be 7 fields. That’s completely okay – remember, Brent’s 5 and 5 Rule just stems from the fact that I have 5 fingers on my left hand, and 5 fingers on my right hand. If I’d had an accident in shop class, we might be talking about Brent’s 4 and 5 Rule, and laughing about my lack of skill with the table saw.
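
In code, that selective-filter scenario might look something like this made-up index: three key columns for the filter, four included columns to cover the SELECT.

-- Made-up example: 3 key columns + 4 included columns = 7 total,
-- which is fine when the key combination is extremely selective.
CREATE INDEX IX_Orders_Status_Region_OrderDate
    ON dbo.Orders (Status, Region, OrderDate)
    INCLUDE (CustomerId, TotalDue, ShipMethod, TrackingNumber);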

Wanna learn how to pick which indexes you should build, and what columns should go in ’em? That’s where my Mastering Index Tuning Class comes in.

34 Comments

  • I’m laughing at the image of the developers for our vendor app having 20 fingers on one hand and 30 on the other

  • Tom Wickerath
    October 8, 2018 4:52 pm

    “The more nonclustered indexes you have on a table, the slower your inserts and deletes will go. It’s that simple.”
    Updates will also be negatively affected, as there are two index modifications (a deletion + an insert) for each data update operation.

    • Tom – well, not exactly. Updates are only involved if you’re updating fields contained in the index. (I left that out on purpose – we dive into that more deeply in the class.)

      • Tom Wickerath
        October 9, 2018 3:08 pm

        Yes, I was implying fields contained within the index.
        I think you could make the same statement about inserts and deletes… e.g., inserts/deletes are *only* involved if you are inserting/deleting data in a column contained in a non-clustered index. It is certainly possible to create non-clustered indexes on columns that allow NULL. I don’t think an insertion or deletion of a row in that case would result in the index needing to be updated. Am I mistaken?

        • Tom – why not test your idea? It’ll only take a matter of seconds using the writes column in sp_BlitzIndex.
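
          A minimal sketch of that experiment (dbo.NullTest and its columns are made-up names):

          CREATE TABLE dbo.NullTest (Id INT IDENTITY PRIMARY KEY, MaybeNull INT NULL);
          CREATE INDEX IX_NullTest_MaybeNull ON dbo.NullTest (MaybeNull);
          INSERT INTO dbo.NullTest (MaybeNull) VALUES (NULL); -- does a NULL insert touch the index?
          EXEC sp_BlitzIndex @TableName = 'NullTest';         -- then check the writes column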

          • Tom Wickerath
            October 9, 2018 4:47 pm

            I just ran an experiment, as you suggested. A bit of a delay, as I had to first download and install sp_BlitzIndex, and create/populate a table with some data. I guess one cannot make the same statement about inserts and deletes. Me thinks this might be because a non-clustered index includes a clustered index’s key as the row locator. 🙂

          • TA-DAH, so awesome. Next watch the How to Think Like the Engine videos. They’re totally free and they’ll open your eyes to a lot of stuff.

        • And to think through it a little deeper when you’re doing the experiment – if you wanted to find all the rows where a column is null, would SQL Server use an index on that column? If so, how would that mechanically work? I love stuff like this – if you haven’t watched How to Think Like the SQL Server Engine, take some time today to watch that. You’ll find a lot of neat stuff in there.

          • Tom Wickerath
            October 9, 2018 3:35 pm

            > If so, how would that mechanically work?
            I wasn’t addressing whether or not SQL Server would *use* an index, as part of compiling a query plan. I was only addressing your statement:
            “The more nonclustered indexes you have on a table, the slower your inserts and deletes will go. It’s that simple.”

          • Totally understood. Now run the experiment. 😉

  • This seems like a really irresponsible guideline, and it encourages database administrators and engineers to reduce their involvement with properly designing a system based on gathered requirements. It’s not only irresponsible, it’s just lazy.

    Here’s a new 5×5 rule. Every month, take 5 indexes that you can tune and 5 that you can drop (based on their usage statistics). My left hand is used for optimization. My right for destruction. This is a far more effective use of a DBA’s time and will be far more impactful than just telling someone they can’t deploy more than 5 indexes.

    • Jason – try reading the whole post. In absolutely no way did I say “they can’t deploy more than 5 indexes.”

      • I get that, but you have to admit that most people will interpret this as a hard and fast “Ozar” rule. I understand the spirit of what you’re trying to say, I’m just worried that others will misinterpret it and start spreading awkward “best practice” recommendations – like, “don’t worry about index fragmentation” leading to not bothering with index maintenance at all.

        For instance, I have a table with 14 indexes. Each of which covers a respective query and performs perfectly for the given requirements. According to the index usage statistics, along with the query execution statistics, this design is doing just fine. The table is not a high-insertion entity so we’re not concerned with any sort of DML related performance issues. We take the approach of investigating overall index performance, rank them by impact, and tune accordingly.

        I suppose this is actually what you’re getting at and you’re just trying to get DBAs started with how to dive in. In which case, I’m sorry for being a jerk.

  • Now we know why Brent is so fond of that Jazz Hands caricature…

  • So you are saying that having a table with 33 indexes, one of which had 998 columns in it is a bad idea?

  • I’ve seen you mention this loose rule before and it normally gives me some pause. We run a bespoke ORM kind of business layer in our application where we have many objects that represent a row in a table, but they might be retrieved by other columns than the identity, so we have a small number of indexes (one per predicate) but each index will include the whole table, basically.

    The upside for us is our business objects can load quickly and have all values on a row, but the downside is any insert/update obviously is doing many more writes.

    In some instances we have different objects which represent a search result spanning multiple tables – these normally only include a subset of the columns and fit the 5×5 rule more closely. But we can’t index for those specifically without removing the fields from the indexes that the main objects use (since the same leading column of the index is used, meaning duplication).

    Our application reads more data than it writes so trading off extra writes vs key lookups on the reads (or significant rewriting so we don’t need all the data immediately on load and can have narrow indexes) seems the right trade off to make.

    So that’s the question – are key lookups potentially as bad (or worse) than more writes?

    • Mark – that’s where all the text below the picture comes in, and you have to read that part. Getting the exact right number of indexes in your environment involves rolling up your sleeves and doing some work. I do wish I could do that work for everyone for free, and we do as much as we can in our open source tools like sp_BlitzIndex, but for personal advice, that’s where our consulting comes in.

      • Fair enough!

        I’m a somewhat long time reader, first time caller and it’s through your guys’ fine work that we’ve got as far as we have. Our application performance is pretty good currently but I always look for things that can be done better.

        That you didn’t ask me if I’d been drinking is encouraging.

  • Toby Ovod-Everett
    November 2, 2018 11:42 am

    Guy Steele (in “Coders at Work”): “If I could change one thing – this is going to sound stupid – but if I could go back in time and change one thing, I might try to interest some early preliterate people in not using their thumbs when they count. It could have been the standard, and it would have made a whole lot of things easier in the modern era. On the other hand, we have learned a lot from the struggle with the incompatibility of base-ten with powers of two.”

    In other news, you may want to investigate Saw Stop table saws. The videos are pretty cool. Now if only I could convince Microsoft to release a version of SQL Server that has a similar feature for Scalar Functions. Any time a developer tried to create a Scalar Function, SQL Server would immediately halt and make them spend $100 before they could restart the service! Then if I could somehow get all of our vendors to use that version for development, I’d be so much happier.

  • The TL;DR for this article is: it depends. I work on an ERP system that has 10 or 15 indexes on most tables out of the box, and we still experience poor performance on some queries or get hints for missing indexes. With an app like that, we can’t remove them due to hints in the code. It’s something we have to live with, and we need to be creative to gain better performance.

  • Start by staying on top of which existing indexes are not used, or indexes where the write-to-read ratio is high. The use cases and data access patterns for an application can drift over time, making once-useful indexes obsolete. Still, what can potentially be worse than an update with too many indexes is an update without a covering index, which results in frequent and extended periods of blocking.
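
    A rough sketch of that write-to-read check (the counters reset when SQL Server restarts, so judge with care):

    SELECT OBJECT_NAME(ius.object_id) AS table_name,
           i.name AS index_name,
           ius.user_updates AS writes,
           ius.user_seeks + ius.user_scans + ius.user_lookups AS reads
    FROM sys.dm_db_index_usage_stats AS ius
    INNER JOIN sys.indexes AS i
        ON i.object_id = ius.object_id
       AND i.index_id = ius.index_id
    WHERE ius.database_id = DB_ID()
      AND ius.user_updates > 10 * (ius.user_seeks + ius.user_scans + ius.user_lookups)
    ORDER BY ius.user_updates DESC;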

  • What about a scenario where you have a date table? Something super static that has zero inserts. We insert like 100 years’ worth of dates in it and there it sits. I see easily 30 indexes on it for different range combinations (last/first day of year, last/first day of month, last/first day of week, etc.). Sure, many of them are rarely used, but is there really any harm in a table with 36K rows in it with no updates having so many indexes? I figure: small table, no updates, no harm.

  • I have recently created 40 indices on a table, some with up to 15 columns, and I am content. It usually only updates overnight once a month, but it is constantly used for ad hoc queries. I am very happy with its performance, and only now am I becoming grateful that I had those extra hands grafted on.

    (NB This is a fatuous comment and is not intended to encourage reckless cosmetic surgery, merely to demonstrate that the rules are more like guidelines – always think about your precise requirements when applying them)

  • Ervin Santiago
    April 21, 2022 1:51 pm

    Just started as the DBA with a company, looking at SQL jobs that run long, and found one that is just doing a simple update to a table but taking nearly 30 minutes. I dug into the database from their software vendor and found the table being updated has 32 – yes, 32 – indexes. Ran sp_BlitzIndex on the table: around 10 have 0 reads, and 4 or 5 are almost identical… shoot me now please

  • What happens with indexing a JSONB column?

  • I’m going to have to do some deeper research, but since the indexes I use are advised and built by the Index Advisor, I don’t believe they are readable with field data. With this being the case, I also don’t believe that it takes the same amount of time to update one record of the index as compared to updating a field in my table. We are not 6NF, which you would need to be very close to for your 1:1 writing speed to apply.

    Any idea what the actual write speed to an Index is, compared to writing to a table? An IBM SQL guru would be great to hear from.

