Best way to mass delete partitioned data?

  • We have a > 40 TB data warehouse that once resided on a single SQL 2008 Enterprise instance which was "mostly" migrated to a three node SQL 2012 Enterpise failover cluster.

    The largest tables are partitioned, and we use a relatively simple partitioning key on a datetime2 column, which is also the clustering key. Only the 2012 and 2013 data are "live" on the new failover cluster.

    2012 and older data also still resides on the single instance 2008 server. SAN storage is getting skinny and we need to delete the 2012 data on the single instance and reclaim the "double dipped" space.

    So my question is, what is the best approach to mass delete a large quantity of partitioned data? My target is only the 2012 data, which should reside on very specific filegroups. We used only 1 file per filegroup, for example with a very boring but descriptive logical name like '20120901File'.

    Can this be as simple as simply updating the partitioning scheme and dropping the 2012 file groups? Or if I do this will I be causing some insidious corruption?

    Do I need to take a more conservative approach and do batch row deletions in something like a while loop (to enforce say 1000 row batches) while also switched to bulk logged recovery model to avoid trashing the transaction log?

    Is there a way to get clever with partition switching? I have done sliding windows in the past, but this isn't quite what I'm after.

    Ballpark estimates are deleting on the order of tens of billions of rows, and 10 - 20 TB of data, so you can see the appeal of being able to quickly drop the floor out from under a bunch of data by simply nuking an entire filegroup. Just not sure if that is feasible and won't have other undesirable consequences.

    As a last thought, I believe our setup has indexes aligned.

    Cheers,

    Jared

Viewing 0 posts

You must be logged in to reply to this topic. Login to reply