The Devil is in the Monitoring Details

Question

Steve Jones - SSC Editor

SSC Guru

Points: 742868
January 26, 2019 at 11:55 am

#373642

Comments posted to this topic are about the item The Devil is in the Monitoring Details

steve.powell 14027 SSC Veteran Points: 246 · Answer 1

If you only check the monitor data when something goes wrong, you're reacting to an issue, not being proactive in managing your servers.

No, you won't catch every instance, but if you keep an eye on the monitors you have a chance of catching something that's building towards a serious, if not critical, problem before it gets that far. Alerts are great for exceptions: when something unexpected happens and your server fails or is close to failure.

As to internal tools vs. third party: I'd always go hybrid. Belt + Braces + Gaffa Tape. You know your systems and what normal behaviour looks like, while third-party tools can't always be configured to account for peculiarities in your servers or to cover everything you would prefer to monitor. So use third-party tools to reduce your workload, but be prepared to add scripts, monitors and in-house tools to cover the local peculiarities of your systems.

That's from my experience at least.

Ralph Hightower SSCrazy Points: 2843 · Answer 2

I created a Windows service to monitor a web application and its database. It retrieves the application login page, connects to the database, and runs a query that is commonly used in the application. If an error occurs on the web server side, it notifies the server hosting team. For SQL Server errors, it notifies the DBA team, and if the error is "Path not found" or another network-related error, it also notifies the server hosting team. I also added a check on when the application's security certificate will expire; it sends notifications a week before expiration. It would be nice to have the certificate expire outside of hurricane season, but it expires at the height of it.
It's not as fancy as the Stack Overflow monitoring, but it did catch the security certificate expiring last year.
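
For anyone wanting to build something similar, the routing rules and the certificate warning described above could be sketched roughly as follows. This is not the actual Windows service, only a minimal Python illustration: the team names, the `teams_to_notify` and `certificate_needs_warning` helpers, the substrings used to spot a network-related error, and the 7-day default are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Set

# Hypothetical team identifiers; the post names the teams, not how they are addressed.
HOSTING = "server-hosting"
DBA = "dba"

# Assumed substrings that mark a SQL Server error as network-related.
NETWORK_MARKERS = ("path not found", "network")


def teams_to_notify(source: str, message: str) -> Set[str]:
    """Return the teams to alert for a failed check.

    source is "web" for the login-page check or "sql" for the database check.
    """
    if source == "web":
        # Web server problems go to the hosting team only.
        return {HOSTING}
    if source == "sql":
        # SQL Server errors always go to the DBA team.
        teams = {DBA}
        text = message.lower()
        # Network-related SQL errors also go to the hosting team.
        if any(marker in text for marker in NETWORK_MARKERS):
            teams.add(HOSTING)
        return teams
    raise ValueError(f"unknown check source: {source!r}")


def certificate_needs_warning(
    not_after: datetime,
    now: Optional[datetime] = None,
    warn_days: int = 7,
) -> bool:
    """True once the certificate is within warn_days of expiry, or already expired.

    Both datetimes must be timezone-aware; mixing aware and naive values raises TypeError.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return not_after - now <= timedelta(days=warn_days)
```

In a real check, `not_after` would come from the certificate itself, for example the `notAfter` field returned by Python's `ssl.SSLSocket.getpeercert()`. Keeping the routing as a pure function makes it easy to test without sending any notifications.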