SQL Server Restart or Cluster Failover Notification

  • Comments posted to this topic are about the item SQL Server Restart or Cluster Failover Notification

  • another way is check sqlserver_start_time from sys.dm_os_sys_info.

    very simple and it works for cluster as well

  • One question about the failover notification, say, we have a two nodes cluster, the current active node is server01, the passive node is server02, when failover happens, server02 will be active, your job will send a failover notification. But what if the cluster is down and then bring server02 up first? Can we tell this is a restart instead of a failover?

  • Thanks for the good article.

  • julian.wu.cn (12/10/2015)


    One question about the failover notification, say, we have a two nodes cluster, the current active node is server01, the passive node is server02, when failover happens, server02 will be active, your job will send a failover notification. But what if the cluster is down and then bring server02 up first? Can we tell this is a restart instead of a failover?

    As the job will fire when the SQL Agent service starts, no, there's no way to differentiate between a restart / failover / manual restart of the service.

    You *might* be able to find out if it was a restart (user-initiated) by using Powershell to read the OS event logs, then based on that sending or not sending the alert, but that's going to be a fairly involved thing to set up.

  • The code does try to notify of a restart on a cluster. If the SQL cluster is active on nodeA and for some reason you take the cluster offline for maintenance or other work then bring it back online on the same nodeA, the job will compare the current and previous node values to see if they are the same or different. If they are the same then you should get an email that says the cluster services may have been restarted but not failed over. If they are different then you should get a notification that a failover has happened.

  • Of note, this will not work as expected in an Always On cluster scenario. In Always On the SQL Agent is running even when a node is considered secondary.

    That said, there is a way to use ALERTS to notify an operator when a failover occurs in an Always ON environment:

    http://sqlmag.com/blog/alwayson-ags-and-sql-server-jobs-setting-failover-alerts

  • Lori

    Sql 2012/14 AlwaysOn feature uses clustering technology underhood. Will your script work in this situation?

    I am interested to know when and why failover occurs in AlwaysOn setup.

    Thanks

    KD

  • KD - This would not work for an AlwaysOn failover because on the secondary all services are always up. I think to get an alert for an AlwaysOn failover all you would need to do is set up the below alert and notification on each replica and have it send an e-mail to you or your team when it fires.

    EXEC msdb.dbo.sp_add_alert

    @name = N'AG Role Change (failover)',

    @message_id = 1480,

    @severity = 0,

    @enabled = 1,

    @delay_between_responses = 0,

    @include_event_description_in = 1;

    GO

    EXEC msdb.dbo.sp_add_notification

    @alert_name = N'AG Role Change (failover)',

    @operator_name = N'[Operator]’,

    @notification_method = 1;

    GO

    My team wrote a post on our company blog about AOAG alerts that you can see here: http://blog.sqlrx.com/2015/08/27/alwayson-monitoring-and-alerting/

  • there is a simpler way to actively checking the windows event log. You always get notification just at the time the failover happens rather than afterward. Here is the step:

    Create a SendFailoverEmailToDBA in window scheduled task(This event triggered task was created by you to track SQL server failover.)

    in edit trigger:

    Begin the task "On an event"

    Log "Application"

    Source "YouSQLServerName"

    EventId "35266" --This is the failover event. There are other event that is related failover as well.

    in edit action

    Action: "Start a program"

    Program/Script: "D:\Program Files\Microsoft SQL Server\110\Tools\Binn\sqlcmd.EXE"

    Add Argument(Optional): "-Q "EXEC msdb.dbo.sp_send_dbmail @profile_name = 'SQLMail', @recipients='dba@yourcompany.com', @subject='Failover Event Occurred!',@body='There has been failover event occurred. Please find out what and why.', @reply_to='dba@yourcompany.com'""

    That's it. I had it setup for four years ago and it never missed any failover event.

  • Great technique Lori and I very much like the HTML format.

  • Nice technique, Lori!

    I parameterized a few things because, through no fault of my own, my server configurations are like a box of chocolates.

    I don't have to deal with clustered SQL Server instances, so this code is for STAND-ALONE INSTANCES ONLY.

    Replace 'fake@fake.com' with your individual address, 'smooth' with an existing operator, and 'high' with an existing DBMail profile.

    Run the code, run the restart_ALERT email job, marvel at the email, change the email to your DBA group's email address, run the code twice for some reason, and you are done.

    No warranties expressed or implied.

    DECLARE@jobId binary(16)

    , @servername varchar(50) = (select @@Servername)

    , @operator varchar(50) = 'smooth'

    , @profile varchar(50) = 'high'

    , @email varchar(50) = 'fake@fake.com'

    DECLARE@commandstage varchar(MAX) = N'SET NOCOUNT ON

    DECLARE @body1 NVARCHAR(MAX)

    -- instructions

    SET @body1=''

    The ' + @servername + ' Instance has been restarted.''

    -- table attributes

    SET @body1=@body1+''<table border="2" cellspacing="2" cellpadding="2">''

    -- column headers

    SET @body1=@body1+ ''<tbody align="left" style="font-family:Arial; font-size: 11;" <TR><TH>Service Name</TH><TH>Startup Type</TH><TH>Status</TH><TH>Startup Time</TH></TR></tbody>''

    -- data

    SELECT @body1=@body1 +''<tbody align="left" style="font-family:Arial; font-size: 11;" <TR><TD>''+servicename+''</TD><TD>''+startup_type_desc+''</TD><TD>''+status_desc+''</TD><TD>''+CAST(last_startup_time AS VARCHAR(30))+''</TD></TR></tbody>'' FROM sys.dm_server_services

    -- Send an html formatted e-mail to notify of restart

    EXEC msdb.dbo.sp_send_dbmail

    @profile_name = ''' + @profile + ''',

    @recipients = ''' + @email + ''',

    @subject = ''SQL restart on ' + @servername + ''',

    @body = @body1,

    @body_format = ''HTML''

    SET NOCOUNT OFF'

    --out with the old

    SELECT @jobId = job_id

    FROM msdb.dbo.sysjobs WHERE (name = N'restart_ALERT')

    IF (@jobId IS NOT NULL)

    BEGIN

    EXEC msdb.dbo.sp_delete_job @jobId

    END

    --in with the new

    EXEC msdb.dbo.sp_add_job@job_name=N'restart_ALERT',

    @enabled=1,

    @notify_level_eventlog=0,

    @notify_level_email=2,

    @notify_level_netsend=2,

    @notify_level_page=2,

    @delete_level=0,

    @description=N'sends email to DBAGroup after SQL Agent starts',

    @category_name=N'Database Maintenance',

    @owner_login_name=N'sa',

    @notify_email_operator_name= @operator,

    @job_id = @jobId OUTPUT

    select @jobId

    EXEC msdb.dbo.sp_add_jobserver@job_name=N'restart_ALERT'

    , @server_name = @servername

    EXEC msdb.dbo.sp_add_jobstep@job_name=N'restart_ALERT'

    , @step_name=N'send email'

    , @step_id=1

    , @cmdexec_success_code=0

    , @on_success_action=1

    , @on_fail_action=2

    , @retry_attempts=0

    , @retry_interval=0

    , @os_run_priority=0

    , @subsystem=N'TSQL'

    , @command= @commandstage

    , @database_name=N'master'

    , @flags=0

    EXEC msdb.dbo.sp_update_job @job_name=N'restart_ALERT',

    @enabled=1,

    @start_step_id=1,

    @notify_level_eventlog=0,

    @notify_level_email=2,

    @notify_level_netsend=2,

    @notify_level_page=2,

    @delete_level=0,

    @description=N'sends email to DBAGroup after SQL Agent starts',

    @category_name=N'Database Maintenance',

    @owner_login_name=N'sa',

    @notify_email_operator_name= @operator,

    @notify_netsend_operator_name=N'',

    @notify_page_operator_name=N''

    DECLARE @schedule_id int

    EXEC msdb.dbo.sp_add_jobschedule@job_name=N'restart_ALERT',

    @name=N'on SQL Agent start',

    @enabled=1,

    @freq_type=64,

    @freq_interval=1,

    @freq_subday_type=0,

    @freq_subday_interval=0,

    @freq_relative_interval=0,

    @freq_recurrence_factor=1,

    @active_start_date=20151222,

    @active_end_date=99991231,

    @active_start_time=0,

    @active_end_time=235959, @schedule_id = @schedule_id OUTPUT

    select @schedule_id

    GO

  • For the Cluster failover. Would setting up an Alert that fires based on Error number 1480 be easier? That should catch the role change, and email the Operators.

  • I would also add history table (this is what I did/added):

    CREATE TABLE [dbo].[ClusterNodesHystory]
      (
         [ID]          [int]            IDENTITY(1,1)     NOT NULL CONSTRAINT [PK_ClusterNodesHystory_ID] PRIMARY KEY CLUSTERED ,
         [CurrentNode] [varchar](100)                 NULL,
         [PreviousNode] [varchar](100)                 NULL,
         [TimeStamp]      [datetime]                     NOT NULL     CONSTRAINT [DF_ClusterNodesHystory_TimeStamp] DEFAULT (GETDATE()),

      )
    GO

    and then as last part of the script:
            INSERT INTO [dbo].[ClusterNodesHystory]([CurrentNode],[PreviousNode])
             SELECT @curnodename,@prevnodename;

    so that way we have full history records

  • good .
    same way can you please post code  to the get the alert when  Always on server changed primary server to secondary server

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply