Mirroring Strange Behaviour Question

  • Hi all

    Last night one of our mirrored databases failed over and then failed back automatically. This seems like strange behaviour to me, as no other mirrored DB failed over with it. The messages I got, in order, were:

    Database mirroring is inactive for database 'xx'.

    Starting up database 'XX' (for all databases but the one in question)

    The mirrored database "XX" is changing roles from "PRINCIPAL" to "MIRROR" because the mirroring session or availability group failed over due to role synchronization. This is an informational message only. No user action is required.

    Following this was the recovery of all the other databases but this one.

    Later on we got a different problem. It looks like we had a network issue, and we were getting the following:

    The mirroring connection to "TCP://xxx.xxx.xxx:5022" has timed out for database "xxx" after 10 seconds without a response. Check the service and network connections.

    Database mirroring is inactive for database 'XX'.

    The mirrored database "XX" is changing roles from "MIRROR" to "PRINCIPAL" because the mirroring session or availability group failed over due to automatic failover

    I am a little confused about why this particular database acted differently from all the others, and also about what caused the initial failover. The second failover makes more sense, as there was a loss of connection to the mirror, hence the DB failed back to the principal.

    Thanks for any hints

  • What is the timeout value you are using for this database?

    select mirroring_connection_timeout from sys.database_mirroring
    where database_id = DB_ID('yourdb')
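
    To compare the one database against the others, it may help to pull the full mirroring configuration for every mirrored database rather than just the timeout. This is a minimal sketch using only columns of sys.database_mirroring. The idea that the odd database differs in safety level, witness state, or role-switch count is an assumption about where the difference might lie, not a diagnosis:

```sql
-- One row per mirrored database on this instance
SELECT DB_NAME(database_id) AS database_name,
       mirroring_role_desc,
       mirroring_role_sequence,        -- number of role switches so far
       mirroring_state_desc,
       mirroring_safety_level_desc,
       mirroring_witness_state_desc,
       mirroring_connection_timeout    -- seconds
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL       -- skip databases that are not mirrored
ORDER BY database_name;
```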

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • They are all at 10 seconds. I guess I could increase it, but I was really wondering why it happened in this manner in the first place.
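
    For reference, raising the timeout would look roughly like the sketch below. The database name and the value of 30 seconds are placeholders for illustration, not recommendations, and the statement has to be run on the principal:

```sql
-- Placeholder database name and timeout value; run on the principal server
ALTER DATABASE [yourdb] SET PARTNER TIMEOUT 30;

-- Confirm the new value
SELECT mirroring_connection_timeout
FROM sys.database_mirroring
WHERE database_id = DB_ID('yourdb');
```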

  • This is commonly a network error. What testing did you perform to ensure the default of 10 seconds was sufficient to mitigate potential issues on your network?

  • The only testing I did was to monitor the network interfaces for a week or so to see how heavily they were being utilised, and I found that they were OK.

    I am open to any suggestions, however.

    Thanks

  • Hi all - This has happened again. I can only assume it's a network issue at that time, which is hard to capture unless I know when it's happening.

    It's strange that no one on the net seems to have seen the exact same errors as me, though.
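
    One way to pin down when the timeouts occur might be to search the current error log for the timeout message and line the timestamps up against the network monitoring. This is only a sketch: sys.xp_readerrorlog is an undocumented procedure, and the parameter meanings noted here are the commonly used ones, so treat them as an assumption:

```sql
-- Assumed parameters: 0 = current log file, 1 = SQL Server error log,
-- then two search strings that must both appear in the message text
EXEC sys.xp_readerrorlog 0, 1, N'mirroring connection', N'timed out';
```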

  • I've been seeing this for quite a while on a hokey test SQLServer VM.

    It doesn't happen on the production machines though.

    Thanks for being the first to raise this 🙂 I too have had difficulty tracking this down.

    I'm going to increase the timeout and see what happens.
