AlwaysOn transaction error 9002: the transaction log for database '' is full due to 'AVAILABILITY_REPLICA'

  • I have proper full and trans backup jobs set for both AlwaysOn Availability Group replicas. The issue is one the databases fills up the transaction log drive and trans log backups, though they appear to succeed, don't free up the space, when manually trying to shrink the log I get 'the transaction log for database '' is full due to 'AVAILABILITY_REPLICA''

    This was supposeldy a bug in 2012 as per https://support.microsoft.com/en-us/kb/2922898 but I am running 2014 SP1, any thoughts please?

  • On which server are you doing the tran log backups and the full backups- primary or replica?

    I had issues on sql server 2012 until I ran the backups on the secondary, and did the shrink on the primary.

    And of course you need to check the minimum size for the log file, to confirm it is set to the number you are shrinking the log file to.

    Sorry if I stated the obvious, just telling you what I had to do.

    Hope this helps.

  • Both full and trans log are done on the primary, as far as I understand full backups need to be done on the primary and 'copy-only' backups can be done on the secondary but not true fulls. How exactly do you have the backups set up? Are you needing to shrink the logs constantly?

  • We have one primary and one secondary in the HA cluster, and do the full and tran log backups on the secondary, but did the shrinks on the primary, because the secondary is read-only so we couldn`t shrink there.

    I only had to shrink the logs once or twice, after I changed the minimum log file size for DBs, since it wouldn`t shrink to less than the minimum size set before I started here.

  • That indicates that data movement for that database from the primary to the secondary is not occurring, either because it's disconnected or movement has been suspended.

    I'd check to make sure that that database is actually synchronized. If it's disconnected or suspended you'll have to address that before the log can be truncated.

    Cheers!

  • All databases are synchronized and everything is apparently working, that's why this is so frustrating.

  • That is interesting. When run from the primary, what does this query show for the database in question on each replica?

    SELECT arcs.replica_server_name, d.name, d.log_reuse_wait_desc, drs.log_send_queue_size,drs.redo_queue_size

    FROM master.sys.databases d

    INNER JOIN master.sys.dm_hadr_database_replica_states drs

    ON d.database_id=drs.database_id

    INNER JOIN master.sys.dm_hadr_availability_replica_cluster_states arcs

    ON drs.replica_id=arcs.replica_id

    ORDER BY name ASC

    Cheers!

  • If you just go by SSMS saying synchronized on both servers, then please make sure you hit refresh to verify they really are synchronized.

    I like to check in the server logs on both sides and through the dashboard periodically, to be sure of the real status.

  • The database on both servers for 'log_reuse_wait_desc' is showing 'AVAILABILITY_REPLICA'. I have already tried removing the database from the Availablity Group, readding it but then it goes back to this state. Synchronization and cluster health are all good.

  • I wasn't so much interested in the log_reuse_wait_desc, as that's just getting pulled from the primary. I included it just to be sure about that piece. I was more interested in the queue sizes.

    Cheers!

  • SERVER-ADBAVAILABILITY_REPLICANULLNULL

    SERVER-BDBAVAILABILITY_REPLICA00

  • Bumping this. I am still receiving the 'availability_replica' errors and backups do not resolve this, I need to remove the database from the availability group first and re-add it to get it to "clear" this. Thoughts?

  • Did you try to explicitly issue a CHECKPOINT on the primary? What happens then?

    Also, if its not a prod box, try running this

    EXEC sp_repldone @xactid = NULL, @xact_segno = NULL, @numtrans = 0, @time = 0, @reset = 1;

  • chrisph (9/23/2015)


    Bumping this. I am still receiving the 'availability_replica' errors and backups do not resolve this, I need to remove the database from the availability group first and re-add it to get it to "clear" this. Thoughts?

    Let's start with a little info on your setup

    how many secondary databases?

    are log backups secondary bound, primary or any?

    what exact version of sql server?

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • @Jackin, how do you force a checkpoint?

    @perry, 4 databases, one large one is causing the errors.

    Full and log backups are set to primary, nightly fulls, 30 min tran logs.

    SQL2014 Enterprise SP1.

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply