• tjaybelt (10/27/2008)


    Steve Jones - Editor (10/27/2008)


    To me these are things that are daily checks, things you need to make sure are available if you get called when you're on-call. Issues should be documented and also raised as some sort of turnover for the next person to be aware of.

    My goal has always been to automate these things, but also to proactively manage the systems, looking to catch things before they fail so that you're not getting called!

    I can see automating some of the items I have detailed. But you are correct, some manual intervention still needs to occur, to ensure that your systems are in proper working order.

    Something that can gather and store the historical data would be nice. Automating some of the data gathering would be nice. But having a human DBA look at it still seems like a necessity.

    Another DBA here just went on call this week, and thought to grab snapshots with SQL Compare as part of his on-call duties. Nice addition. If something goes wonky during the week, we have a snapshot to compare with...

    In the same vein of automating, one useful tool is an automated restore of your backups on another server.

    That assumes you have another server (or servers) available - in our case we have custom log-shipping of LiteSpeed backups to off-site DR servers. This is configured to keep those off-site DB servers 1 hour behind the live servers (which CAN be invaluable when a junior DBA says ... "oooops!!" when asked to do an update). In conjunction with MOM reporting set to alert (via SMS and e-mail) on failures, we have automated, alerting knowledge of the viability of our backups.

    Needless to say, several of the other obviously automatable tasks (like the space available etc.) are also configured to send MOM alerts on threshold values (typically at 80%, 90%, 95% and 99% full - so no server should run out of space without at least 4 alerts).
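For anyone without MOM, the tiered-threshold idea can be roughed out in a few lines of script. This is only a sketch of the logic, not how MOM is configured: the function names (`percent_full`, `crossed_thresholds`, `new_alerts`) are made up for illustration, and the disk check at the bottom assumes the script runs on the server being checked. It fires each threshold once as usage climbs, so a drive going from empty to full produces four alerts.

```python
import shutil

# Threshold percentages taken from the post above.
THRESHOLDS = (80, 90, 95, 99)


def percent_full(total_bytes, free_bytes):
    """Return how full a volume is, as a percentage."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * (total_bytes - free_bytes) / total_bytes


def crossed_thresholds(pct, thresholds=THRESHOLDS):
    """Return every threshold at or below the given percentage, ascending."""
    return [t for t in sorted(thresholds) if pct >= t]


def new_alerts(previous_pct, current_pct, thresholds=THRESHOLDS):
    """Return thresholds crossed since the previous check (one alert each)."""
    already = set(crossed_thresholds(previous_pct, thresholds))
    return [t for t in crossed_thresholds(current_pct, thresholds)
            if t not in already]


if __name__ == "__main__":
    # Hypothetical usage: check the current drive; a real job would persist
    # the previous reading between runs and send SMS/e-mail instead of print.
    usage = shutil.disk_usage(".")
    pct = percent_full(usage.total, usage.free)
    for level in new_alerts(0, pct):
        print(f"ALERT: volume is {pct:.1f}% full (threshold {level}%)")
```

Comparing against the previous reading is what keeps a drive sitting at 91% from re-sending the 80% and 90% alerts on every run.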

    This does add an additional check on the "daily checks" list however - make sure MOM is up and working 😀