SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Purging Data Increases IO performance?


Purging Data Increases IO performance?

Author
Message
richykong
richykong
SSC-Enthusiastic
SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)

Group: General Forum Members
Points: 196 Visits: 621
TLDR: I purged about 50% of the data in a data warehouse during an upgrade and 3 days later, the IO_Completion wait stats drastically dropped. Is this normal?

I have a data warehouse loading process in which ~ 12 GB of numeric data is loaded from a binary data source to a newly created empty SQL table every night. Once this data is loaded, a new data file is created. A clustered key on the new file is created to move the loaded data from the Primary file to the new file. Once that's complete, some post load processing is run to pre aggregate the data for reporting.

After a few months of daily loading, we were told by the business that we should start purging some of the older data since it is completely useless after ~60 days. We were at about 1.5 TB of data at this point and the estimated reduction from the purge was about 1 TB. I performed the purge on Monday and the IO improvement wasn't seen till today (Wed).

Has anyone else seen IO improvements like this simply by freeing up the storage?

Server capacity is 2.45 TB. 15k RAID5
4 NUMA nodes with 20 proc's each
128 GB RAM
sqlGDBA
sqlGDBA
SSC-Enthusiastic
SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)

Group: General Forum Members
Points: 191 Visits: 312
If the purge was accompanied by a clustered index rebuild OR if the choice of keys in the clustered index is right, the table should resize after the purge. And yes, given the B tree structure of indexes, the amount of I/O incurred while resolving queries can drop drastically. If you have the index stats before and after, that should show it too. Last but not the least, how do you know that the query load on your DW is a constant ? What if the amount of queries againsst this table has dropped recently ? You may want to ascertain that as well, before drawing any conclusions.
richykong
richykong
SSC-Enthusiastic
SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)SSC-Enthusiastic (196 reputation)

Group: General Forum Members
Points: 196 Visits: 621
I forgot to mention that each load is done to a new table. At the start of the each load process, a new table.

The process happens ~1:00 am every night and the only other load against the server is from my Data Collection for the MDW. The server is currently only used to load the DW and pump out some reports in the morning. No users perform any queries against the server. I have been measuring the load process duration and IO stats and it has been fairly stable for months. Last night's run was the first time I've seen this happen.
DiverKas
DiverKas
Old Hand
Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)

Group: General Forum Members
Points: 321 Visits: 460
Could autogrowth be accounting for this? If the files were growing each time, and the growth was wrong, you could incur a ton of IO, and after the purge, provided no shrink occured, would certainly be large enough to take in the loads without growing.

Just a guess.
Koopa
Koopa
Grasshopper
Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)

Group: General Forum Members
Points: 12 Visits: 59
How many disks are in your array? Is it one big LUN for all disks?

It could be the simple fact that now that you've cleared up 1TB of space at the start of your LUN (outer edges), Windows is allocating the new data files at the start of the LUN, which will be faster, especially for sequential reads. Compound that performance increase over the number of disks in your array.... using only fast outer edges of a drive is called "short stroking".

Try to see where your files are sitting physically on the disks.
ScottPletcher
ScottPletcher
SSCrazy Eights
SSCrazy Eights (8.2K reputation)SSCrazy Eights (8.2K reputation)SSCrazy Eights (8.2K reputation)SSCrazy Eights (8.2K reputation)SSCrazy Eights (8.2K reputation)SSCrazy Eights (8.2K reputation)SSCrazy Eights (8.2K reputation)SSCrazy Eights (8.2K reputation)

Group: General Forum Members
Points: 8160 Visits: 7163
It's normal IF the indexing on the data was not optimal, so that you were scanning old data before that has now been removed.

Note that when you DELETE a large number of rows SQL defers the actual release of the data pages until later. A batch process later removes these "ghost" rows in increments. For 1TB data, I guess it's possible that it took more than 2 days to completely DELETE the ghost rows.

You should rebuild the affected tables after that large a DELETE.

You should also review index statistics, including missing indexes and current index usage and table scan counts, to verify whether you need to re-index one or more of the tables.

SQL DBA,SQL Server MVP(07, 08, 09)[size=2]Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them.[/size]
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search