I participated in a week long hackathon recently. It was great to be able to spend the whole week on a self-directed project. I’m excited to write about what my team accomplished, but actually I want to blog about what another team accomplished. That team implemented a really nice “send to recycle bin” feature and they gave me the green light to write about it here.
The recycle bin feature is ultimately a data-hiding feature. Users don’t necessarily want to destroy data, they just don’t want to look at it right now. There are a lot of ways to implement this feature, but one way is by making a few changes in the database (as opposed to the application).
What Needs To Change?
Surprisingly not much. Take your table and give it a nullable RecycleDate
column. This is all you need to track the recycled rows. Then create a view that filters out recycled items. That’s pretty much it. Afterwards, if you rename the table, then the view can take its place. This is what that would look like on Adventureworks’ Sales.ShoppingCartItems
table:
ALTER TABLE Sales.ShoppingCartItem ADD RecycleDate DATE NULL CONSTRAINT DF_ShoppingCartItem_RecycleDate DEFAULT NULL; GO EXEC sp_rename 'Sales.ShoppingCartItem', 'AllShoppingCartItems' GO CREATE VIEW Sales.ShoppingCartItem WITH SCHEMABINDING AS SELECT ShoppingCartItemID , ShoppingCartID , Quantity , ProductID , DateCreated , RecycleDate FROM Sales.AllShoppingCartItems WHERE RecycleDate IS NULL; GO CREATE PROCEDURE Sales.s_RecycleShoppingCartItem ( @ShoppingCartItemId INT ) AS UPDATE Sales.ShoppingCartItem SET RecycleDate = GETDATE() WHERE ShoppingCartItemID = @ShoppingCartItemId; GO |
DML Impact
So what’s the impact on other Delete, Insert, Update or Select statements that are executed against your modified table?
- Delete statements shouldn’t be affected. You’ll notice that recycle bin contents can’t be deleted via the view. That’s okay.
- Old Insert statements should work as expected with no adjustments, especially if you name your columns in a column list.
- Update statements? Check, they’ll continue to work.
- Select statements will also be unaffected. Especially if you’ve avoided SELECT *.
What About Foreign Keys?
Okay, this is where it gets little tricky. If you don’t use ON DELETE
or ON UPDATE
clauses with your foreign keys, then you have to be a little careful. I want to show just one example of how things can get a bit messy. Returning to our Adventureworks example, lets think about a query that deletes “shopping carts” as long as it has no items.
DELETE Sales.ShoppingCart WHERE ShoppingCartId = @ShoppingCartIdToDelete AND NOT EXISTS ( -- any items in the cart? SELECT 1 FROM Sales.ShoppingCartItem WHERE ShoppingCartId = @ShoppingCartIdToDelete ) |
In the old world, this works no problem. But our check for items in the cart misses items that have been recycled and so this query would fail. You’ll have to remember to find queries like this and update them to check Sales.AllShoppingCartItems
instead.
Data Lifecycle Policy Concerns
You have a policy right? The lack of one can make it too easy to retain data indefinitely. The concern isn’t necessarily storage, but whether you’re meeting any policies or regulations concerning privacy or other things like that.
The recycle bin feature may make it a little easier to accidentally retain data you didn’t mean to. It may be worth regression testing any delete or purge functionality that you have.
Indexing
Depending on how much data is hidden in the recycle bin, you shouldn’t have to re-evaluate your indexing strategy. Your indexes should probably serve you just as well after this implementation. But if you find yourself storing more than 90% of your data as recycled data, then you may want to start considering re-assessing the table’s indexes. You could consider things like filtered indexes, filtered stats and/or partitioned tables. But before you do, see Data Lifecycle Policy Concerns above.
Other Things To Watch
Any changes to schema or any code should lead to extra testing and the changes I’m proposing are no different.
You have to know your app and environment. Is your recycle bin against a table that participates in downstream Business Intelligence projects? How about Change-Data-Capture? Service Broker? Notification Services? You know better than I do.
Other Reycling Bin Implementations
There are lots of methods.
For example, You don’t have to implement this pattern using SQL. You can implement it in your application. Hiding recycled data via the application makes a lot of sense. Especially if your more of a programmer than a SQL developer (By the way, where’d you come from? Who let you in here?)
It’s worth giving this some thought. Without a recycle bin, the demand to retrieve “deleted” data can be great enough to prompt someone to dig through a restored backup. Digging through restored backups actually counts as a recycle bin implementation even if it is an unintentional and painful one.
[…] Implementing the Recycle Bin Pattern In SQL, Michael J. Swart […]
Pingback by SQL Recycle Bin Pattern - The Daily Six Pack: April 14, 2014 — April 14, 2014 @ 5:33 am
Thanks for sharing Michael. Would be interesting to see some discussions on “on-table recycling” vs “off-table recycling”.
Comment by Louis Li — April 14, 2014 @ 6:08 pm
Thanks for sharing! That’s a great approach…
Comment by Thiago — April 22, 2014 @ 10:16 am
[…] Implementing the Recycle Bin Pattern In SQL – Michael J. Swart (Blog|Twitter) […]
Pingback by (SFTW) SQL Server Links 25/04/14 • John Sansom — April 24, 2014 @ 1:06 pm
Nice article, thanks. One small correction:
I think you need to place the EXEC sp_rename statement before the ALTER TABLE Sales.AllShoppingCartItems block, otherwise you’re trying to alter a non-existent object.
Comment by Rich — April 25, 2014 @ 9:08 am
Nice catch Rich, Thank you!
I’ve corrected the script so that it works properly.
Comment by Michael J. Swart — April 25, 2014 @ 12:18 pm