High Index Count/Space versus High CPU and Logical/Physical Reads

  • We are working on reducing the high number of IO waits on one of our production servers. A large percentage of the waits appears to come from one of our applications, based on the queries with the highest accumulated physical reads and other investigation we have done.

    The application writes a large amount of data into a holding table (similar to a temp table), then runs various queries to normalize the data (distributing it into smaller tables with specific data, say one table for sales info, one for invoices/sales, one for employee info, one for customer info and addresses, etc.) and to partition it into month/year tables (e.g., Invoices_Oct12, Invoices_Nov12, etc.).
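
    Schematically, each distribute query follows a pattern something like this (all table and column names here are made up for illustration; the actual schema isn't shown):

    -- Hypothetical example of one normalize/distribute step: copy October 2012
    -- invoice rows from the holding table into that month's partition table.
    INSERT INTO dbo.Invoices_Oct12 (InvoiceID, CustomerID, InvoiceDate, Amount)
    SELECT h.InvoiceID, h.CustomerID, h.InvoiceDate, h.Amount
    FROM   dbo.HoldingTable AS h
    WHERE  h.InvoiceDate >= '20121001'
      AND  h.InvoiceDate <  '20121101';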

    I realize this is one of those "it depends" kind of questions, but I need to get a sanity check before I commit these changes:

    One of the stored procedures (the one that normalizes the data) runs 18 individual queries as part of the normalization. When reviewing the execution plans, each query produced a "missing index" recommendation. Just for testing, I applied all of the recommendations (they just go on the holding table, not the normalized tables receiving the final data). The overall query time does not change significantly (nature of the inserts?), but the CPU time, scan counts, logical and physical reads definitely go down (in some cases, 94% improvement). However, this comes at the expense of the indexes consuming more space than the data (3:1 ratio).
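
    For anyone wanting to reproduce this kind of comparison, the before/after numbers can be captured with a harness along these lines (the procedure name is hypothetical):

    -- Measure CPU time, scan counts, and logical/physical reads for one run.
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    EXEC dbo.usp_NormalizeHoldingData;  -- hypothetical name for the normalization proc

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;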

    Question: would it make sense to apply the indexes and live with the higher space consumption (which is somewhat temporary and should not really increase over time) in exchange for the CPU and read improvements?
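
    For reference, the space ratio can be checked like this (the holding-table name is a placeholder):

    -- Quick check of data versus index space for the holding table.
    EXEC sp_spaceused 'dbo.HoldingTable';

    -- Or broken out per index:
    SELECT i.name                               AS index_name,
           SUM(ps.used_page_count) * 8 / 1024.0 AS used_mb
    FROM   sys.dm_db_partition_stats AS ps
    JOIN   sys.indexes AS i
           ON  i.object_id = ps.object_id
           AND i.index_id  = ps.index_id
    WHERE  ps.object_id = OBJECT_ID('dbo.HoldingTable')
    GROUP BY i.name;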

    I am thinking the answer in this case is YES, but I would be interested in other people's opinions. 🙂

    The application does bulk inserts, so I do not believe the initial inserts into the holding/temp table would be affected (still to be tested).

    Thanks! 🙂

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

  • Yup, it's a "just depends" question. My view is that you do whatever best achieves the end result from your database/application.

    As an example, I had a client where one of the DW loads ran for many hours, which meant reports were not ready early enough in the day (obviously I'm simplifying this explanation). I suggested they apply some indexes: "oh, we don't want to do that, indexes will slow the data load". Well, yes, maybe, but they'd speed up the next part. So I added the indexes, the overall process got several hours shorter, and reports were actually ready before the office opened.

    So what I'm saying is: do what you think is best. As for more indexes than data: yup, in many databases/applications the key tables use more space for indexes than for data. In fact, if I look at one of my main production systems, the indexes take more space than the data.

    Hope this helps; I think you're going down the right path. By the way, suggested indexes aren't always 100% right. Take the suggestions as just that, suggestions, and double-check them yourself if you are able.

    The GrumpyOldDBA
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • DB_Newbie2007 (3/6/2013)

    If this is a process where you blow away all data and reload new data each run, I would drop all indexes (except perhaps the clustered one) and truncate the table first. Then do your load while adhering to the various requirements for minimally logged inserts; that will get the data in MUCH more quickly. Now test with and without the indexes in place and decide whether you want them on or not. Obviously, if the box NEEDS CPU cycles while this process is running, you probably should build the indexes. Note that you can also explicitly throttle the index builds with the MAXDOP option, as in the sketch below.
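
    A rough sketch of that sequence (all object names and the MAXDOP value are placeholders):

    -- 1. Drop the nonclustered indexes before the load (names hypothetical).
    DROP INDEX IX_Holding_InvoiceDate ON dbo.HoldingTable;

    -- 2. Empty the holding table.
    TRUNCATE TABLE dbo.HoldingTable;

    -- 3. Load with a table lock so the insert can qualify for minimal logging
    --    (also depends on the recovery model and the other documented requirements).
    INSERT INTO dbo.HoldingTable WITH (TABLOCK)
    SELECT *
    FROM   dbo.StagingSource;  -- hypothetical source

    -- 4. Rebuild the index afterwards, throttled to limit CPU use.
    CREATE INDEX IX_Holding_InvoiceDate
        ON dbo.HoldingTable (InvoiceDate)
        WITH (MAXDOP = 2);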

    You may want to get a professional on board to review your process and system. There could be some opportunities for improvement that you aren't aware of...

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on Google's mail service

  • It's rarely a good idea to blindly implement every missing index recommendation, in my experience. If you look carefully at the queries, you might find you can construct a single index that speeds up several of them at once, and that would make your inserts quicker as well because there would be fewer indexes to update!
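
    As a hypothetical example of that consolidation:

    -- Two hypothetical missing-index suggestions:
    --   1) key (CustomerID)              INCLUDE (InvoiceDate)
    --   2) key (CustomerID, InvoiceDate) INCLUDE (Amount)
    -- can often be served by one consolidated index:
    CREATE INDEX IX_Holding_Customer_InvoiceDate
        ON dbo.HoldingTable (CustomerID, InvoiceDate)
        INCLUDE (Amount);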

  • I totally agree about blindly accepting missing index recommendations! 🙂

    We run a weekly process that checks the server uptime (3 days or greater is our criterion), then captures all of the missing index recommendations across all of our databases/customer queries. I review them and determine whether any make sense to implement.
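
    The capture step looks roughly like this, using the standard missing-index DMVs (the exact columns you keep are a matter of taste):

    -- Only capture if the server has been up long enough for the stats to mean something.
    IF DATEDIFF(DAY,
                (SELECT sqlserver_start_time FROM sys.dm_os_sys_info),
                GETDATE()) >= 3
    BEGIN
        SELECT d.[statement]       AS table_name,
               d.equality_columns,
               d.inequality_columns,
               d.included_columns,
               s.user_seeks,
               s.avg_user_impact
        FROM   sys.dm_db_missing_index_details     AS d
        JOIN   sys.dm_db_missing_index_groups      AS g
               ON g.index_handle = d.index_handle
        JOIN   sys.dm_db_missing_index_group_stats AS s
               ON s.group_handle = g.index_group_handle
        ORDER BY s.user_seeks * s.avg_user_impact DESC;
    END;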

    Ami Levin wrote a good article (http://www.sql-server-performance.com/2010/logical-reads/) that discusses missing index recommendations and further fine-tuning.

    Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)
