Make SQL Server Agent Jobs HADR Aware

  • Comments posted to this topic are about the item Make SQL Server Agent Jobs HADR Aware

  • Nice piece of code. We just manually added first step to all HADR-sensitive jobs to check whether the used database is primary (sys.fn_hadr_is_primary_replica) and to raise error if not. The error on this step then stops the entire job.

    The good thing is all jobs can be sync-copied to all servers and nothing needs to be changed on failover.

  • I believe implementing a "master control job" which automatically disables or enables jobs as needed is a far more elegant solution.

     

    I delivered a couple of webinars about it:

    https://eitanblumin.com/portfolio/how-to-hadr-your-sql-jobs/

    Accompanying blog post here:

    https://eitanblumin.com/2020/05/26/sql-jobs-based-on-hadr-role-next-level/

  • Any thoughts about using a separate server to host agent jobs and have that pole the AG to find out which is the primary and target that server? The majority of my agent jobs call stored procedures and since the databases would be replicated in the AG, what is here is there as well. When a given job executes it polls the AG, determines the primary and using dynamic SQL the procedure call is formed to point to the primary. In theory, this would keep the jobs pointed at the primary no matter which member is the primary at any given time. This sort of checking would be at the top of any given job step. In my head, this would work well, but it wouldn't be able to deal with a situation where a job is executing and a fail-over occurs during that job step. There could also be issues with things like file dependencies where an incoming file is on the primary, a fail-over occurs, and the file is not on the secondary. Of course, there are ways to try/catch this or even avoid working with things like files on any member and only read or right to a share or some other location. Does this seem plausible?

    Cheers

  • It's plausible but it sounds like way too much effort, too many moving points of failure and it would require significant ongoing maintenance effort.

    May I suggest a simpler, more elegant solution?

    https://eitanblumin.com/portfolio/how-to-hadr-your-sql-jobs/

  • @jakub.Janda, thanks for the comment. That also works.

    The idea here is to support some jobs that are runnable on a secondary.

  • @EitanBlumin, totally. I like your solution as well.

  • I saw this when it first came out and had meant to get back to it. It took me a while but I made it. 🙂

    Rather than a slightly complex process to adjust the jobs we added a preliminary step that seems to do the trick. All of the jobs that might need to exist for any db in the AG exist on each node. The below code is in the first step of each:

    If NOT EXISTS ( Select  Primary_replica
    From sys.dm_hadr_availability_group_states
    Where Primary_replica = @@servername
    )
    Begin
    Raiserror ( 'Replica %s is NOT Primary',16, -1, @@servername )
    End

    The step is set to "Quit the job reporting success" as its failure action.

    It's been working like a charm for us since 2012.

    Have a great day.

    Richard

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply