SQL DBI service terminated 4 hours after SQL cluster node started

  • During an MS patch maintenance window on a 2-node SQL cluster, the first node was shut down, patched and rebooted, then the second node was shut down, patched and booted.

    Startup looked fine, but 4 hours later two SQL services crashed (see below).

    The SQL services are set to Manual start; the cluster manager starts them.

    Any ideas why it would take so long for the services to crash?

    Could it have been a SAN issue affecting the SQL error logs?

    DB2

    6:02:22 system restart

    6:09:10 system started

    6:06:19 error: cluster service did not shut down properly after receiving a preshutdown control (event ID 7043)

    6:10:06 cluster service started

    DB3

    6:06:45 failover cluster: db2 removed from cluster (event ID 1135)

    6:29:07 system shutdown

    6:32:16 cluster service started

    6:32:20 SQL reporting service started

    6:32:44 system started

    .

    .

    10:39:24 SQL DBI service terminated with service-specific error %%17058 (event ID 7024)

    Could not open error log file 'Z:\MSSQL10_50.DBI\MSSQL\Log\ERRORLOG'

    10:39:39 SQL DBI_RS service terminated with service-specific error %%17058

    Could not open error log file 'E:\MSSQL10_50.DBI_RS\MSSQL\Log\ERRORLOG'

    10:39:48 SQL Agent (DBI) failed due to SQL Server DBI failure

    10:39:57 SQL Agent (DBI_RS) failed due to SQL Server DBI_RS failure

    10:47:08 SQL reporting services stopped

    10:47:12 SQL reporting services started

    11:14:41 rebooted db3 again

    11:17:27 system up, all fine

    SQL Startup params:-

    DBI_RS on 3p

    -dE:\MSSQL10_50.DBI_RS\MSSQL\DATA\master.mdf;

    -eE:\MSSQL10_50.DBI_RS\MSSQL\Log\ERRORLOG;

    -lE:\MSSQL10_50.DBI_RS\MSSQL\DATA\mastlog.ldf

    DBI on 2p

    -dZ:\MSSQL10_50.DBI\MSSQL\DATA\master.mdf;

    -eZ:\MSSQL10_50.DBI\MSSQL\Log\ERRORLOG;

    -lZ:\MSSQL10_50.DBI\MSSQL\DATA\mastlog.ldf
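
Since both failures were "could not open error log file", it may help to see which drive each instance depends on at startup. The following is a minimal sketch, not part of the original post: it splits a startup parameter string like the ones above into its -d (master data file), -e (error log) and -l (master log file) paths and lists the drive letters involved. The function names `parse_startup_params` and `drives_needed` are hypothetical helpers introduced for illustration.

```python
import ntpath  # Windows path rules, usable from any OS


def parse_startup_params(raw):
    """Split a semicolon-separated startup string into {flag letter: path}."""
    params = {}
    for token in raw.split(";"):
        token = token.strip()
        # Each token looks like "-e<path>": a dash, one flag letter, then the value.
        if len(token) > 2 and token[0] == "-":
            params[token[1]] = token[2:]
    return params


def drives_needed(params):
    """Return the set of drive letters referenced by the parsed paths."""
    return {ntpath.splitdrive(path)[0] for path in params.values()}


# Startup parameters for the DBI instance, copied from the post above.
dbi = parse_startup_params(
    r"-dZ:\MSSQL10_50.DBI\MSSQL\DATA\master.mdf;"
    r"-eZ:\MSSQL10_50.DBI\MSSQL\Log\ERRORLOG;"
    r"-lZ:\MSSQL10_50.DBI\MSSQL\DATA\mastlog.ldf"
)
print("error log:", dbi["e"])
print("drives:", sorted(drives_needed(dbi)))
```

For DBI every path sits on Z:, and for DBI_RS every path sits on E:, so if either drive was unavailable or unreadable to the service at 10:39, the corresponding instance could not have opened its ERRORLOG.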

  • Do you have a job that rolls the error log at 10:39?

    Did something happen to the permissions on the log folder?

  • arnipetursson (7/25/2014)


    Do you have a job that rolls the error log at 10:39?

    Did something happen to the permissions on the log folder?

    "Did something happen to the permissions on the log folder?"

    I heard that the permissions had changed somehow, but after the second reboot everything was OK.
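
One rough way to test the permissions theory is to try creating a scratch file in the log folder. The sketch below is only an illustration under stated assumptions: `probe_log_dir` is a hypothetical helper, and it tests the account that runs the script, so it only says something about error 17058 if it is run as the SQL Server service account on the node that owns the disk.

```python
import os
import tempfile


def probe_log_dir(log_dir):
    """Return (ok, detail): can the current account create a file in log_dir?"""
    if not os.path.isdir(log_dir):
        return False, "directory not found (drive offline or path wrong?)"
    try:
        # Create and immediately delete a scratch file; existing files are not touched.
        with tempfile.NamedTemporaryFile(dir=log_dir, prefix="probe_", suffix=".tmp"):
            pass
    except OSError as exc:
        return False, f"cannot create file: {exc}"
    return True, "directory exists and is writable by this account"


# Log folders taken from the error messages in the first post.
for folder in (r"Z:\MSSQL10_50.DBI\MSSQL\Log", r"E:\MSSQL10_50.DBI_RS\MSSQL\Log"):
    ok, detail = probe_log_dir(folder)
    print(folder, "->", "OK" if ok else "FAILED", "-", detail)
```

A "directory not found" result points towards the disk not being online on that node, while a "cannot create file" result points towards permissions on the folder.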

  • Hmm... it sounds like you have a 2-node active/active cluster here.

    If these were just OS patches and not SQL Server related, then shutting down SQL on a node, applying the OS patches and rebooting, one node at a time, should have worked just fine.

    Did you find any other errors in the Windows logs or the cluster logs?

    Regards
    Rudy Komacsar
    Senior Database Administrator
    "Ave Caesar! - Morituri te salutamus."
