Data Lifecycle Management using SQL Server

  • Comments posted to this topic are about the item Data Lifecycle Management using SQL Server

  • You've missed the select statement in scenario 1 after "INSERT INTO Seattle_Police_Department_911_Incident_Response" to put the data back in from the temp table.

    Otherwise, interesting read!

  • Thank you for noticing. I submitted an update to complete the statement.

  • Nice article. I like the data scenario you built around it.

  • Thank you. Always good to work with real open data.

  • Thanks for writing this article. I have been meaning to read it since it posted about a week ago. Very good advice for anyone stuck in the position of needing to clean up massive change/log files that have accumulated. As it happens, today I am onsite at a client where we manage an Upsert process (that includes dealing with - argh - deletes on the source system) and as the target tables average between 750 million and 1.2 billion rows (yes, properly tuned, SQL Server does scale well), we are VERY concerned with the constant expansion of our change_log tables. We employ a *fifth* strategy which you didn't mention, which is to create and implement a process to constantly remove records older than *n* number of days from the change_log tables - we simply prune them on a daily basis via tasks late in the ETL, and occasionally write the remaining data out to temp tables and back in as a way to manage fragmentation on the disk.

  • That's a challenging project you are working on. Auditing and logging tables are really underrated as far as records management goes. Everyone wants to keep track of changes, but rarely there is a SLA stating what will be the retention period and a storage budget to insure it's feasible.

  • Agreed. Ironically, just last week I gave a presentation at the Silicon Valley Microsoft Data Platform User Group meeting on ETL logging. Half that presentation makes the case for partnering with the business to create a formal Logging Policy (and offers steps for how to create one) - data and ETL engineers know what kind of logging they need to serve their own purposes, but if one asks the right business stakeholders, one can often find parts of the data architecture or parts of the loading process architecture which are of extreme interest to the business. Creating a formal Logging Policy and/or SLA around those helps bridge the gap between the two worlds, to say the very least. If you would like the deck that goes with the presentation, I'm happy to send.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply