Indexes designed for reading from tables intended for writing

  • Here's my scenario:

    I have a production database that stores information about campaigns. It contains several tables that store the details of the campaigns, and one table that stores the campaign activity.

    The campaign detail tables are written to 10-100 times per day.

    The campaign activity table is written to 50,000-500,000 times per day.

    Right now, all of these tables have indexes intended to speed up reads. A small number of those indexes are also used when the writing takes place, and therefore need to exist either way.

    However, the vast majority of them are there purely to support querying.

    Now, as the tables have grown, the impact of the queries against them has grown correspondingly. Even with indexes covering the queries, they were still fairly intensive and would occasionally cause performance problems.

    As a result, at some point I decided to create a table of aggregate stats from the activity table and use that table instead for reporting and querying. This worked great - the queries against the aggregate table were instantaneous, and the load against the main table was reduced to a small fraction of the original.
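
    Roughly, the aggregate approach looks like this (the table and column names below are illustrative placeholders rather than the real schema):

        -- Daily roll-up table fed from the raw activity table.
        CREATE TABLE dbo.CampaignActivityDaily (
            CampaignID    int      NOT NULL,
            ActivityDate  datetime NOT NULL,   -- midnight of the day being summarised
            ActivityCount int      NOT NULL,
            CONSTRAINT PK_CampaignActivityDaily PRIMARY KEY (CampaignID, ActivityDate)
        );

        -- Roll up yesterday's rows once a day, e.g. from a SQL Agent job.
        DECLARE @today datetime, @yesterday datetime;
        SET @today     = DATEADD(day, DATEDIFF(day, 0, GETDATE()), 0);
        SET @yesterday = DATEADD(day, -1, @today);

        INSERT INTO dbo.CampaignActivityDaily (CampaignID, ActivityDate, ActivityCount)
        SELECT CampaignID, @yesterday, COUNT(*)
        FROM dbo.CampaignActivity
        WHERE ActivityTime >= @yesterday
          AND ActivityTime <  @today
        GROUP BY CampaignID;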

    However, the indexes are still there on the original tables, for one simple reason: debugging. Whenever there are problems with the campaigns, I need to go into the database and run various queries to investigate the cause.

    As such, there are numerous indexes that are used only rarely but are updated constantly, and therefore need regular rebuilding/reindexing.

    Unfortunately this rebuilding/reindexing is very intensive at times. On the activity table it causes around 10-15 minutes of downtime, which can mean a loss of around 1,000-1,500 writes against the table.

    So, here's my question. The indexes are unavoidable - they have to exist for me to be able to do my debugging. But at what point should I decide to run a duplicate database? Is the load on my current database, given that I am the only user running the debugging queries, enough to warrant creating a new database, performing a backup-restore into it every day, using that copy for queries, and removing the read-only indexes from the original?

    Also, is there any other way to avoid that downtime for reindexing? I know I can rebuild instead of reindex, which is faster, but the indexes in question are very fragmented due to the large number of writes taking place every day. It would be nice if I could do online reindexing, but that requires Enterprise edition, which I don't have. And, to my knowledge, reindexing is quite important, since otherwise the index is going to be essentially crippled by the fragmentation.
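
    For context, this is the sort of check and maintenance I mean (assuming SQL Server 2005 or later; the index and table names are placeholders). ALTER INDEX ... REORGANIZE runs online in any edition, whereas a full rebuild with ONLINE = ON needs Enterprise:

        -- Check fragmentation on the activity table's indexes.
        SELECT i.name, s.avg_fragmentation_in_percent, s.page_count
        FROM sys.dm_db_index_physical_stats(
                 DB_ID(), OBJECT_ID('dbo.CampaignActivity'), NULL, NULL, 'LIMITED') AS s
        JOIN sys.indexes AS i
          ON i.object_id = s.object_id AND i.index_id = s.index_id;

        -- REORGANIZE is an online operation in any edition and can be stopped
        -- without losing the work already done.
        ALTER INDEX IX_CampaignActivity_CampaignID
            ON dbo.CampaignActivity REORGANIZE;

        -- A full rebuild only runs online on Enterprise edition:
        -- ALTER INDEX IX_CampaignActivity_CampaignID
        --     ON dbo.CampaignActivity REBUILD WITH (ONLINE = ON);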

  • Hi,

    I am not sure if you have solved your problem yet, but here's what I think about it.

    Look at debugging on a production database from a business point of view. Even if you are the only person accessing the database this way, your queries may disrupt production processes, which then have to deal with locking and possibly memory pressure, because you are loading data into memory that the insert processes don't need. I think this alone is a good reason to work on a copy rather than on the production database. If possible, restore the copy on a separate server to avoid any interference with production.
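
    A minimal sketch of that restore-to-another-server approach (the database name, paths and logical file names are hypothetical - check yours with RESTORE FILELISTONLY):

        -- On production: take a full backup (COPY_ONLY keeps the normal backup chain intact).
        BACKUP DATABASE CampaignDB
        TO DISK = N'\\backupshare\CampaignDB_full.bak'
        WITH INIT, COPY_ONLY;

        -- On the separate debugging server: restore it under its own name.
        RESTORE DATABASE CampaignDB_Debug
        FROM DISK = N'\\backupshare\CampaignDB_full.bak'
        WITH MOVE 'CampaignDB'     TO N'D:\Data\CampaignDB_Debug.mdf',
             MOVE 'CampaignDB_log' TO N'D:\Data\CampaignDB_Debug_log.ldf',
             REPLACE;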

    Regarding the second question, the first idea that comes to mind is: can you somehow identify parts of the tables that do not change? For example, information for campaigns that have finished, with no (or very few) further updates. You could split the tables to keep the individual indexes smaller, and then create a partitioned view on top of them. The view can still be updateable, so from the client applications' point of view nothing would change.

    The mostly read-only data wouldn't need its indexes updated so often, and even when they are, the operation would be faster.

    If you cannot identify read-only data, it might still make sense to divide the table into smaller ones, using a date as the partitioning key, for example - see the sketch below.
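
    A rough sketch of such a partitioned view (the table layout, column names and the cutoff date are made up for illustration; for the view to stay updateable, the partitioning column must be part of each member table's primary key and carry the CHECK constraints):

        -- Activity split into an archive table (finished campaigns, rarely touched)
        -- and a current table, on a made-up date boundary.
        CREATE TABLE dbo.CampaignActivity_Archive (
            ActivityID   int      NOT NULL,
            CampaignID   int      NOT NULL,
            ActivityTime datetime NOT NULL CHECK (ActivityTime <  '20100101'),
            CONSTRAINT PK_Activity_Archive PRIMARY KEY (ActivityID, ActivityTime)
        );

        CREATE TABLE dbo.CampaignActivity_Current (
            ActivityID   int      NOT NULL,
            CampaignID   int      NOT NULL,
            ActivityTime datetime NOT NULL CHECK (ActivityTime >= '20100101'),
            CONSTRAINT PK_Activity_Current PRIMARY KEY (ActivityID, ActivityTime)
        );
        GO

        -- Clients keep using one name; the CHECK constraints let the optimiser
        -- touch only the relevant member table.
        CREATE VIEW dbo.CampaignActivityAll
        AS
        SELECT ActivityID, CampaignID, ActivityTime FROM dbo.CampaignActivity_Archive
        UNION ALL
        SELECT ActivityID, CampaignID, ActivityTime FROM dbo.CampaignActivity_Current;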

    Regards

    Piotr

    ...and your only reply is slàinte mhath

    How often do you have to debug using these indexes?

    An easy (but ugly) solution, if it happens only rarely, would be to create the indexes when needed and drop them again once the problem is solved - a rough sketch follows.
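
    Something along these lines (names are placeholders). Note that creating the index itself scans the table and, without Enterprise's online option, blocks writes while it builds, so it's best done off-peak:

        -- Add the debugging index only while investigating...
        CREATE NONCLUSTERED INDEX IX_Debug_Activity_Campaign
            ON dbo.CampaignActivity (CampaignID, ActivityTime);

        -- ...and drop it again once the problem is solved.
        DROP INDEX IX_Debug_Activity_Campaign ON dbo.CampaignActivity;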

    Another option could be to use log shipping: when you need to debug, take a backup of the log-shipped DB, restore it in a dev environment, and create the indexes there. You may need some extra time, but your production system wouldn't carry much extra load.
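
    One way to do that with the backups log shipping already produces (all file names and paths below are invented): restore the last full backup WITH NORECOVERY on the dev server, apply the shipped log backups, recover, and then build the indexes there.

        -- On the dev server: restore the most recent full backup, leaving the
        -- database ready to accept log backups.
        RESTORE DATABASE CampaignDB_Debug
        FROM DISK = N'\\logship\CampaignDB_full.bak'
        WITH MOVE 'CampaignDB'     TO N'E:\Dev\CampaignDB_Debug.mdf',
             MOVE 'CampaignDB_log' TO N'E:\Dev\CampaignDB_Debug_log.ldf',
             NORECOVERY, REPLACE;

        -- Apply the shipped transaction log backups in order...
        RESTORE LOG CampaignDB_Debug
        FROM DISK = N'\\logship\CampaignDB_20100101_0100.trn'
        WITH NORECOVERY;

        -- ...then bring the copy online and build the debugging indexes here.
        RESTORE DATABASE CampaignDB_Debug WITH RECOVERY;
        CREATE NONCLUSTERED INDEX IX_Debug_Activity_Campaign
            ON dbo.CampaignActivity (CampaignID);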

    Josep
