Purging Data Increases IO performance?

  • TLDR: I purged about 50% of the data in a data warehouse during an upgrade and 3 days later, the IO_Completion wait stats drastically dropped. Is this normal?

    I have a data warehouse loading process in which ~ 12 GB of numeric data is loaded from a binary data source to a newly created empty SQL table every night. Once this data is loaded, a new data file is created. A clustered key on the new file is created to move the loaded data from the Primary file to the new file. Once that's complete, some post load processing is run to pre aggregate the data for reporting.

    After a few months of daily loading, we were told by the business that we should start purging some of the older data since it is completely useless after ~60 days. We were at about 1.5 TB of data at this point and the estimated reduction from the purge was about 1 TB. I performed the purge on Monday and the IO improvement wasn't seen till today (Wed).

    Has anyone else seen IO improvements like this simply by freeing up the storage?

    Server capacity is 2.45 TB. 15k RAID5

    4 NUMA nodes with 20 proc's each

    128 GB RAM

  • If the purge was accompanied by a clustered index rebuild OR if the choice of keys in the clustered index is right, the table should resize after the purge. And yes, given the B tree structure of indexes, the amount of I/O incurred while resolving queries can drop drastically. If you have the index stats before and after, that should show it too. Last but not the least, how do you know that the query load on your DW is a constant ? What if the amount of queries againsst this table has dropped recently ? You may want to ascertain that as well, before drawing any conclusions.

  • I forgot to mention that each load is done to a new table. At the start of the each load process, a new table.

    The process happens ~1:00 am every night and the only other load against the server is from my Data Collection for the MDW. The server is currently only used to load the DW and pump out some reports in the morning. No users perform any queries against the server. I have been measuring the load process duration and IO stats and it has been fairly stable for months. Last night's run was the first time I've seen this happen.

  • Could autogrowth be accounting for this? If the files were growing each time, and the growth was wrong, you could incur a ton of IO, and after the purge, provided no shrink occured, would certainly be large enough to take in the loads without growing.

    Just a guess.

  • How many disks are in your array? Is it one big LUN for all disks?

    It could be the simple fact that now that you've cleared up 1TB of space at the start of your LUN (outer edges), Windows is allocating the new data files at the start of the LUN, which will be faster, especially for sequential reads. Compound that performance increase over the number of disks in your array.... using only fast outer edges of a drive is called "short stroking".

    Try to see where your files are sitting physically on the disks.

  • It's normal IF the indexing on the data was not optimal, so that you were scanning old data before that has now been removed.

    Note that when you DELETE a large number of rows SQL defers the actual release of the data pages until later. A batch process later removes these "ghost" rows in increments. For 1TB data, I guess it's possible that it took more than 2 days to completely DELETE the ghost rows.

    You should rebuild the affected tables after that large a DELETE.

    You should also review index statistics, including missing indexes and current index usage and table scan counts, to verify whether you need to re-index one or more of the tables.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply