How fast can I know that the server is going down or is down
Posted Monday, March 24, 2014 11:18 AM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Yesterday @ 7:43 AM
Points: 101, Visits: 470
I've been trying to figure out the fastest way to get alerted when a server is going down, is down, or was rebooted.
1. I can tell whether a server has been rebooted, and I could write scripts to check the log file, maybe daily.
2. But can I get the server to send out a message when a reboot has been issued?
3. I've also seen that I can set up an alert that fires when the system has only been up for a short while, which would suggest it has just come back online.
4. Lastly, what if the server goes down and doesn't come back online? My thought is that only a scheduled external monitor could alert on that condition.

Does anyone have any best practices? I'm thinking of setting up 1 and 3 for my small environment.
Any tips/advice would be appreciated.

thanks
Post #1554154
Posted Monday, March 24, 2014 12:18 PM


Hall of Fame

Group: General Forum Members
Last Login: Monday, July 14, 2014 2:06 PM
Points: 3,865, Visits: 7,130
Best practices? It depends. You could use third-party software such as Centreon/Nagios, Red Gate SQL Monitor, or Idera's free admin tool (any of them would work).

For options 1 and 3 you could use a startup procedure that sends you an email once the service has been restarted - http://technet.microsoft.com/en-us/library/ms191129(v=sql.100).aspx


______________________________________________________________________________
"Never argue with an idiot; They'll drag you down to their level and beat you with experience"
Post #1554184
Posted Monday, March 24, 2014 1:23 PM
SSC-Enthusiastic

Thanks. I just needed some good ideas to get my thoughts flowing.
Post #1554201
Posted Monday, March 24, 2014 3:18 PM


SSCoach

Group: General Forum Members
Last Login: Today @ 10:03 AM
Points: 15,561, Visits: 27,938
You could put in a startup procedure that sends you an email each time the server comes online. That wouldn't tell you it was going down, but it would be self-contained. The real issue is that you can't rely on the server to tell you it's going offline, because a server that crashes or loses power never gets the chance. Instead, you need another server to watch the first one (a who-watches-the-watchmen kind of deal). You can build this out yourself any number of ways. I liked using SQL Agent on the second server to set up a regular job that queries the first. If the query fails, you know it's offline. That's one way to do it.

But, as was already said, building your own monitoring suite is a lot of work. Better to buy one. There are so many out there that are built so much better than what you'll be able to put together yourself that it just makes sense.

Fair warning, I work for a vendor, but I won't bring up the product.
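For anyone who wants to try the "second machine watches the first" idea before buying a tool, here is a minimal Python sketch of the probe. It is not the SQL Agent job described above, just the same idea run from any other box: try to open a TCP connection and treat a failure as "offline". The function names are made up for illustration, and port 1433 is an assumption (the default-instance port; named instances usually listen elsewhere). A successful connect only shows that something is listening, not that SQL Server is answering queries.

```python
import socket


def is_reachable(host: str, port: int = 1433, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unresolvable).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_servers(servers, probe=is_reachable):
    """Return the (host, port) pairs that did not answer the probe."""
    return [(host, port) for host, port in servers if not probe(host, port)]
```

Run it on a schedule from a machine other than the one being watched, and send the notification from there too, since a dead server cannot send its own email.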


----------------------------------------------------
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." Theodore Roosevelt
The Scary DBA
Author of: SQL Server 2012 Query Performance Tuning
SQL Server 2008 Query Performance Tuning Distilled
and
SQL Server Execution Plans

Product Evangelist for Red Gate Software
Post #1554228
Posted Tuesday, March 25, 2014 7:58 AM


Hall of Fame

@Grant, yeah I had already thrown the bone in that direction

Post #1554477
Posted Tuesday, March 25, 2014 8:02 AM
SSC-Enthusiastic

Thanks. We actually use Red Gate Backup Pro and love it. The only thing about vendor software is that I need to figure out what I can do natively, so that I truly understand how it works and what my limitations are, before I can sell it to my bosses.
Post #1554481
Posted Tuesday, March 25, 2014 8:06 AM
SSC-Enthusiastic

@mydog... yeah, thanks. I hit up the MSDN link and was able to put that in place on our servers as a temporary start. I'm going to research some monitoring apps and maybe scripts this week to try to get our environment more tied down. Today I'm going to put in some critical alerts, so any tips on those would be nice.

Post #1554485
Posted Tuesday, March 25, 2014 8:35 AM


Hall of Fame

To start, I'd recommend setting up default alerts for:
- Error Number 823 - IO/Hardware/System Issue detected
- Error Number 824 - IO/Logical Consistency Check Failed
- Error Number 825 - Read/Retry Warning
- Severity 016 - Miscellaneous User Error
- Severity 017 - Insufficient Resources
- Severity 018 - Nonfatal Internal Error
- Severity 019 - Fatal Error in Resource
- Severity 020 - Fatal Error in Current Process
- Severity 021 - Fatal Error in Database Process
- Severity 022 - Fatal Error: Table Integrity Suspected
- Severity 023 - Fatal Error: Database Integrity Suspected
- Severity 024 - Fatal Hardware Error Raised
- Severity 025 - Fatal Error
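
One way to avoid clicking through thirteen alert dialogs is to generate the T-SQL. The sketch below is only an illustration: the helper names, the alert names, and the 60-second delay are assumptions, not anything from the posts above. It prints `msdb.dbo.sp_add_alert` calls for the three error numbers and for severities 16 through 25, which you would then review and run yourself.

```python
ERROR_NUMBERS = [823, 824, 825]
SEVERITIES = range(16, 26)  # 16 through 25 inclusive


def alert_statement(name, message_id=0, severity=0, delay_seconds=60):
    """Build one msdb.dbo.sp_add_alert call as T-SQL text."""
    # An alert is defined on either a message id or a severity, not both.
    if (message_id == 0) == (severity == 0):
        raise ValueError("set exactly one of message_id or severity")
    return (
        "EXEC msdb.dbo.sp_add_alert "
        f"@name = N'{name}', "
        f"@message_id = {message_id}, "
        f"@severity = {severity}, "
        "@enabled = 1, "
        f"@delay_between_responses = {delay_seconds}, "
        "@include_event_description_in = 1;"  # 1 = include error text in e-mail
    )


def build_alert_script():
    """Return the full script: one statement per error number and severity."""
    lines = [alert_statement(f"Error {n}", message_id=n) for n in ERROR_NUMBERS]
    lines += [alert_statement(f"Severity {s:03d}", severity=s) for s in SEVERITIES]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_alert_script())
```

An alert created this way does not notify anyone until it is tied to an operator (for example with `msdb.dbo.sp_add_notification`), so that step still has to be done separately.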


Post #1554505
Posted Tuesday, March 25, 2014 10:10 AM


SSCertifiable

Group: General Forum Members
Last Login: Today @ 10:31 AM
Points: 7,122, Visits: 15,030
The SQL Server service writes an entry to the Windows event log when it is being shut down. Unless it crashes outright, you can use WMI to have Windows notify you when those entries appear in the log.

System Center and third-party tools like BlueStripe (I believe that's the name) can also monitor the SQL Browser services in your network, comparing the list of "available" servers against the expected list. When a server is shut off, it disappears from the list and the tool can notify you.


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #1554567
Posted Tuesday, March 25, 2014 3:07 PM
SSC-Enthusiastic

Thanks again for the helpful information. I was wondering if there is anything in the SQL Server error log that I should be monitoring with sp_readerrorlog that falls outside of those alerts. I looked through our logs and didn't see anything outside of 'our' norm: lots of I/O taking a long time to complete, but no one who can fix that is fixing it (I've sent tons of I/O metrics over the last year, and an alert on that would just become daily/hourly noise).

I'm wondering whether using sp_readerrorlog would be archaic, and whether I should just find a third-party tool that encapsulates all of that. BTW, this is on a DW, so most issues are low priority since we can recreate our DW to the previous day, unlike OLTP.
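
If the error log is read with a script, the known noise can be filtered out before anything gets alerted on. A rough Python sketch of that idea follows, assuming the rows have already been fetched somehow as (log date, process info, text) tuples, the three columns sp_readerrorlog returns. The function name and the pattern list are hypothetical; the single pattern shown matches the slow-I/O warning and would need extending for whatever else counts as normal in a given environment.

```python
import re

# Assumption: messages this environment treats as routine noise.
NOISE_PATTERNS = [
    re.compile(r"I/O requests taking longer than \d+ seconds to complete"),
]


def unexpected_entries(rows, noise_patterns=NOISE_PATTERNS):
    """Return rows whose text matches none of the known-noise patterns.

    rows is an iterable of (log_date, process_info, text) tuples.
    """
    return [
        row for row in rows
        if not any(pattern.search(row[2]) for pattern in noise_patterns)
    ]
```

Whatever is left after filtering is what falls outside the norm, which keeps the daily/hourly I/O messages from drowning out anything new.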
Post #1554696