The Devil is in the Monitoring Details

  • Comments posted to this topic are about the item The Devil is in the Monitoring Details

  • If you only check the monitor data when something goes wrong, you're reacting to an issue, not being proactive in managing your servers.

    No, you won't catch every instance, but if you keep an eye on the monitors you have a chance of catching something that is building towards a serious, if not critical, problem before it gets that far. Alerts are great for exceptions: when something unexpected has happened and your server has failed or is close to failure.

    As to internal tools vs. third party: I'd always go hybrid. Belt + braces + gaffer tape. You know your systems and what normal behaviour looks like, while third-party tools can't always be configured to account for peculiarities in your servers or to cover everything you would prefer to monitor. So use third-party tools to reduce your workload, but be prepared to add scripts, monitors and in-house tools to cover the local peculiarities of your systems.

    That's from my experience at least.
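
    One way to picture "catching something that's building" is a small in-house trend check that sits next to the third-party tooling. The sketch below is only an illustration under assumptions that are not from the post: the metric (disk usage in percent), the daily sampling, and the function name `days_until_threshold` are all hypothetical. It fits a straight line to recent samples and estimates how long until a threshold is crossed.

```python
from typing import Optional, Sequence, Tuple


def days_until_threshold(
    samples: Sequence[Tuple[float, float]], threshold: float
) -> Optional[float]:
    """Estimate days from the last sample until the metric reaches threshold.

    samples is a sequence of (day, value) pairs, oldest first.
    Returns None when there is no upward trend to project.
    Hypothetical helper: a plain least-squares line, nothing more.
    """
    n = len(samples)
    if n < 2:
        return None

    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        return None  # all samples taken at the same time

    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope <= 0:
        return None  # flat or shrinking, nothing is building up

    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    last_day = samples[-1][0]
    # Clamp at zero: the fitted line may already be past the threshold.
    return max(0.0, crossing_day - last_day)


if __name__ == "__main__":
    # Assumed data: disk usage (%) sampled once a day.
    usage = [(0, 50.0), (1, 52.0), (2, 54.0), (3, 56.0)]
    print(days_until_threshold(usage, threshold=90.0))
```

    A check like this would not replace alerts. It only gives an early hint that a normal-looking value is drifting, which is the kind of local knowledge a generic tool may not encode.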

  • I created a Windows service to monitor a web application and its database. It retrieves the application login page, then connects to the database and runs a query that the application commonly uses. If an error occurs on the web server side, it notifies the server hosting team. For SQL Server errors, it notifies the DBA team, and if the error is "Path not found" or another network-related error, it also notifies the server hosting team. I also added a check on when the application's security certificate will expire; it sends notifications a week before expiration. It would be nice to have the certificate expire outside hurricane season, but it expires at the height of it.
    It's not as fancy as the Stack Overflow monitoring, but it caught the security certificate expiry last year.
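
    The routing and certificate rules described above can be sketched roughly as follows. This is not the poster's Windows service; it is a minimal Python illustration, and every name in it (`teams_to_notify`, `should_warn`, `fetch_cert_expiry`, the team labels, the marker strings) is an assumption made for the example. Only the standard library is used.

```python
import socket
import ssl
from datetime import datetime, timedelta, timezone

# Assumed substrings that mark a SQL-side error as network related.
NETWORK_MARKERS = ("path not found", "network-related", "network related")


def teams_to_notify(source: str, message: str) -> set:
    """Pick the teams to notify for a failed check.

    source is "web" for the login page check or "sql" for the query check.
    """
    if source == "web":
        return {"hosting"}
    teams = {"dba"}
    if any(marker in message.lower() for marker in NETWORK_MARKERS):
        teams.add("hosting")  # network problems also go to hosting
    return teams


def should_warn(not_after: datetime, now: datetime, warn_days: int = 7) -> bool:
    """True once the certificate is within warn_days of expiring (or past it)."""
    return not_after - now <= timedelta(days=warn_days)


def fetch_cert_expiry(host: str, port: int = 443, timeout: float = 10.0) -> datetime:
    """Read the notAfter date of the certificate served by host.

    With the default context the handshake itself raises an
    ssl.SSLCertVerificationError if the certificate has already expired,
    so a caller would treat that exception as an expiry alert too.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    seconds = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(teams_to_notify("sql", "Path not found"))
    print(should_warn(now + timedelta(days=5), now))
```

    Keeping the routing and the date comparison as plain functions, separate from the network call, means they can be exercised without touching a real server.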
