Spring Cleaning

  • Comments posted to this topic are about the item Spring Cleaning

  • We always say "disk is cheap". But we know that backup time and space, network bandwidth, data recovery time in a crisis, etc. are not cheap.

    We intend to keep all "real" production data, forever. (The production database for our mid-sized company is about 40GB now, with 12 years' data).

    We have scheduled scripts to purge "not real" data from the system after say 30 days. Example: emails that are received into our system, parsed, and saved as other records. We keep the "other records" forever, but we purge the emails after 30 days... enough time to catch any unexpected bugs.

    We have very-detailed logging of our application's usage in separate databases, and we spin off new copies of those once a year or so. Those databases are 50GB each or so; the most recent is online and older ones are offline; only brought online when needed. Those are only backed up once, not part of our nightly jobs; if we lost them in a crisis we wouldn't care all that much, but as long as we have the disk space, we keep them around.

    Our biggest issue with spring cleaning is removing code and data tables/columns that are no longer used. It's so much lower risk to just leave them be, but we so badly want to get rid of them.

  • We do "spring cleaning" on a regular basis. Try to purge any records that are not essential to the business, clean out bad processes or databases from prior dba's etc.

    There is always something to be doing in regards to spring cleaning and ensuring the environment is fit.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • We have an application with about 15 years of data still sitting there. Spring cleaning has not been undertaken, with the justification that disk is cheap but orderly cleanout is very expensive.

    Interestingly, we needed an upgrade. Unfortunately the inventory component has been re-written, and the extensive updates and table changes required for the upgrade on the 15-year inventory database now mean that we cannot perform the upgrade even in our longest upgrade window over the public holiday.

    Now we are going through an even more expensive process - implementing the new version with a starting point of NO data - somehow bring across master tables and job templates as required, start again.

    Sad but true - and a cautionary tale for those "disk is cheap" proponents that you can be burned by failure to consider the information life-cycle!

  • I'm in the process of a related but slightly different spring cleaning. I have some scripts and batches that iterate a list of SQL servers, and return a list of all the login accounts with SysAdmin.

    I intend to challenge every one found except the "PROD\SQL Admins" group (containing my own login and that of my co-DBA). On production instances I want to reduce each service account to read/write, and each application admin to read-only. On Dev-Test-D/R boxes I would allow developers and possibly application admins to have read-only, and developers to do DDL. Exceptions by vendors to be documented for service accounts used by third party apps.

    Whenever my challenge is overruled, that response will in turn be supplied with my response to the next audit.

    CYA, people.

    Mike Hinds Lead Database Administrator1st Source BankMCP, MCTS

  • Spring cleaning systems is rarely done because you're always on to the next new project. Your own in-house projects aren't seen as money makers, so, they're always on the back burner. There have been several attempts where I work to clean up but no one can ever find the time.

  • In addition to the technical costs, there can be a legal risk (cost) with retaining certain types of data. My company has taken a stance on certain data being purged after a set retention period. There were several instances of lawsuits (divorces I believe) where we were not involved in the case, but it was known that we held certain financial records that were determined crucial to the case. Each subpoena caused a lot of manual work to retrieve those records and was enough to be measurable as an expense. To reduce this expense, we now retain the records just long enough for us to do business with our customers.

    We do take certain types of detail data and distill them to aggregates after a certain time so that we can measure trends, etc. but have no legal exposure to holding the detailed data.

    Each time we start new applications, we have a quick categorization of the types of data we are capturing and review the retention period with product owners (and legal if need be) and then build the retention/purge mechanisms in from the beginning.

    In my 20+ years of IT, I never thought that the law would impact my retention plan more than the cost of disks. Times have changed.

  • Yes, absolutely. Whether it is "Spring", "Fall" or "when you have the time" cleaning. We are watchful for how long we store unused data. Freeing up disk space or cleaning unused processes are but one of the many tasks a dba can perform to help ensure system maintainability, performance and backup and restore efficiencies.

    - Jeff

  • Rob Nickolaus-860201 (4/23/2010)


    In addition to the technical costs, there can be a legal risk (cost) with retaining certain types of data. My company has taken a stance on certain data being purged after a set retention period. There were several instances of lawsuits (divorces I believe) where we were not involved in the case, but it was known that we held certain financial records that were determined crucial to the case. Each subpoena caused a lot of manual work to retrieve those records and was enough to be measurable as an expense.

    Like Rob, our County has been brought into legal dilemmas by the data we keep. Tax and/or property issues have resulted in subpoenas coming our way to pull County data from emails, court records, or 45 year old tax records. We have, in the past two years begun following the state mandated retention periods for many, many different types of documents. While it is a pain to stay on top of, it makes things a lot easier to tell an attorney that our records go so far and that's it. Very long term documents are sent to the State Repository in another county, so our storage issues have been significantly reduced.

  • Absolutely yes! We just completed a production archival of over 150 million records we no longer needed on our daily production boxes. It's not really complex to do, but you must have a clearly thought-out plan going into it and have all your ducks in a row before starting. As a result, database size, index size, log shipping, replication, etc. has benefited from doing this. We do it March-April of every year. Highly recommended if you want to keep your db servers humming! 😀

    "Technology is a weird thing. It brings you great gifts with one hand, and it stabs you in the back with the other. ...:-D"

  • I go by the motto "Why wait for spring? Do it now!" as we've found once a year is nowhere near the frequency needed for database cleanup. With upwards of 50 separate test databases plus copies of client databases spread over a half dozen physical or virtual servers I have to perform "clean up" at least quarterly. It can be a real battle with some individuals as they just do not want to part with that 5 year old database that hasn't been touched in two years. Yet these same people raise quite a fuss when there is insufficient disk space on the server to host their latest 20 GB copy of a client's database. For the most part everyone is quite reasonable about cooperating with the managing our use of server disk space.

  • Good to see some people doing this.

    I'm torn on the disk space argument. It does cost $$, but so does people's time. If there are other things to do, is it worth making them clean things up?

  • Steve Jones - Editor (4/23/2010)


    Good to see some people doing this.

    I'm torn on the disk space argument. It does cost $$, but so does people's time. If there are other things to do, is it worth making them clean things up?

    Excellent point. Data archiving and cleanup are generally on my list of things to do, but being the only DBA in charge of number of production, QA, and development servers in the past, it was way down on that list. 😀

    "Technology is a weird thing. It brings you great gifts with one hand, and it stabs you in the back with the other. ...:-D"

  • Steve, how big is the SQLServerCentral.com database?

  • Steve Jones - Editor (4/23/2010)


    Good to see some people doing this.

    I'm torn on the disk space argument. It does cost $$, but so does people's time. If there are other things to do, is it worth making them clean things up?

    Flip the argument around: those files each user holds onto unnecessarily need to get backed up each and every time a server backup is performed, so they add to the "time" effort one way or another. So delete them once, or - back them up each and every day from now until, well, forever?

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply