• muthyala_51 (4/19/2013)


    But increasing the response time might not give us the actual root cause of why it happened.

    I am looking further into the I/O error we received; it looks to be a disk I/O issue. I ran the Perfmon counter and saw that Avg. Disk sec/Transfer is > 0.015 seconds during the file copy.

    I also noticed that while copying a file of around 4 GB to one of the disk drives, SQL Server hung and everything was frozen for a couple of minutes. The databases on the mirror server showed a status of (Disconnected / In Recovery) and returned to a normal state after a few minutes. Can you direct me on this? Thanks.

    Root cause? Your principal and witness servers were unable to communicate during the timeout period, which resulted in the witness determining that the principal server was down and initiating a failover to the mirror.

    Why? Not enough network bandwidth to communicate due to large data transfer(s) occurring.

    Once again, I had this issue at a previous employer; the resolution was to increase the timeout period before a failover occurred. This solved the issue of our somewhat unstable network causing a failover when there really wasn't a problem. Our automatic failover worked fine when there were real problems with our servers.
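
    If it helps, increasing the timeout looks roughly like the sketch below. The database name YourMirroredDb and the 30-second value are placeholders I made up for illustration, not settings from your environment, so pick a value that suits your network. The first query only reads the current setting (in seconds; the default is 10), and the ALTER DATABASE statement is run on the principal.

    ```sql
    -- Show the current mirroring timeout (seconds) for each mirrored database.
    SELECT DB_NAME(database_id) AS database_name,
           mirroring_role_desc,
           mirroring_state_desc,
           mirroring_connection_timeout
    FROM sys.database_mirroring
    WHERE mirroring_guid IS NOT NULL;

    -- Run on the principal. YourMirroredDb and 30 are placeholder values.
    ALTER DATABASE [YourMirroredDb] SET PARTNER TIMEOUT 30;
    ```

    Rerunning the SELECT afterwards should show the new value in mirroring_connection_timeout. A longer timeout delays automatic failover by the same amount when a server really is down, so it is a trade-off rather than a free fix.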