Error checking in SQL Jobs

  • Due to some data refresh and application requirements, we need to roll a database forward to our production server on a nightly basis. The issue is that, since we now have so many connections to the DB, we can't gain exclusive access to it to run the restore. Then the DB gets stuck in single-user mode and our applications get hosed. I wrote error-checking code to try to prevent this from happening, but I guess I didn't write it correctly, because it didn't work.

    There are a total of 4 steps to the job. Separately from the error-checking issue, I think one of the steps might be a problem, but I'm not sure, so I'm looking for insight on that too.

    Any insight would be appreciated. Thanks!

    Step 1: Kill Pids

    Step 2: Change to single user mode

    Step 3: Kill Pids again, in case there are any still hanging around (this might be an issue, as it seems to kill the pid that runs the job itself).

    Step 4: Restore the DB. Here is my code. There is a move and some sync-users code in there as well, but this gives you the idea of what I was attempting to do.

    use master

    go

    IF @@ERROR <> 3101 --Error 3101 is for not obtaining exclusive access to DB.

    BEGIN

    RESTORE DATABASE [XXX] FROM DISK = N'\\mynetworkpath_mydb.bak' WITH FILE = 1,

    END

    ELSE

    BEGIN

    ALTER DATABASE [HRSADW] SET MULTI_USER WITH ROLLBACK IMMEDIATE

    END;
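
    A note on why the IF above probably never reaches the ELSE branch: @@ERROR reports the error number of the statement that ran immediately before it. Here that is the USE statement in the previous batch, so the value is almost certainly 0 and the RESTORE always runs. Below is a minimal sketch of one way to restructure the step, assuming SQL Server 2012 or later (for THROW). The database name [MyDb] and the backup path are placeholders, not values from the original job.

    ```sql
    USE master;
    GO

    BEGIN TRY
        -- Placeholder database name and backup path; the MOVE and other
        -- options from the real job are omitted here.
        RESTORE DATABASE [MyDb]
            FROM DISK = N'\\server\share\MyDb.bak'
            WITH FILE = 1;
    END TRY
    BEGIN CATCH
        -- Any restore failure lands here. ERROR_NUMBER() may report the
        -- generic 3013 ("RESTORE DATABASE is terminating abnormally")
        -- rather than 3101, so this sketch does not test for one number.
        ALTER DATABASE [MyDb] SET MULTI_USER WITH ROLLBACK IMMEDIATE;

        -- Re-raise the error so the job step still reports failure.
        THROW;
    END CATCH;
    ```

    The ALTER DATABASE in the CATCH block can itself fail if another session already holds the only connection to a single-user database, so this reduces the problem rather than eliminating it.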

  • Instead of changing it to single user mode, what about changing it to restricted user mode. Then only sa or dbo will be an issue and if you've set up security right, your users shouldn't have that level of access.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning
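
    The restricted-user suggestion above could look roughly like the following. This is a sketch only: [MyDb] is a placeholder name, and the termination clause (WITH ROLLBACK IMMEDIATE) is an assumption about how existing connections should be cleared.

    ```sql
    USE master;
    GO

    -- Disconnects existing sessions and rolls back their open transactions
    -- immediately. Afterwards only members of db_owner, dbcreator, or
    -- sysadmin can connect.
    ALTER DATABASE [MyDb] SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE;

    -- Confirm the current access mode
    -- (SINGLE_USER, RESTRICTED_USER, or MULTI_USER).
    SELECT DATABASEPROPERTYEX(N'MyDb', 'UserAccess') AS user_access;

    -- ... run the restore here ...

    -- Reopen the database to everyone afterwards.
    ALTER DATABASE [MyDb] SET MULTI_USER;
    ```

    Unlike SINGLE_USER, this does not hand the database to whichever connection arrives first, provided the application logins are not in any of those roles.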

  • Hi, thanks. That will help with one of the issues we are having, but I guess I didn't ask my question correctly.

    What I am trying to understand is why my error checking code isn't preventing the DB from getting stuck in Single User mode?

  • Because in the microseconds between when you finish killing all the sessions and when you set the database to single_user, another connection came in and took it over.

  • Hi -

    One more follow-up question, as I am trying to understand how jobs really work. When I had this job 'all in one step', it worked most of the time. But as the failures increased, I thought I needed to redesign it, so I broke it into 4 steps inside the job. Broken into 4 steps, and with the suggested change made, it still had the same issue: it couldn't kill the PIDs and would then fail.

    So now I thought, maybe put all the code into one step, so that if that code fails, the job fails. But I don't understand how the change I made affects how SQL Server jobs work. Why does a job with multiple steps fail when the same code in one step works?

  • If by working and not working you mean that setting single-user mode sometimes works better in a single batch than in multiple batches, it's down to timing and processing power on the CPU. If you have multiple batches, there is simply more time between commands than when all the commands are in a single batch. But even in a single batch, there is still time between commands. It's in that gap that a connection slips in on you and the single_user setting doesn't work.

  • Hi -

    OK, so when I say step(s) you say batch(es), because that is how SQL Server executes them, as batches. As for single-user mode, I took your suggestion and changed it to RESTRICTED.

    All that being said, since our own IIS server is creating the connections that block the restore when it runs as multiple batches, I tried turning off the offending app pools via a timer a minute before the SQL job runs. Then, 6 minutes later, the app pools get turned back on. It is kind of clumsy, because that task runs on a separate server as a Windows scheduled task, but based on my limited knowledge of getting SQL Server A to talk to Web Server Z, that is the best I can come up with.

    But what you are telling me about how SQL Server executes batches makes sense. Each batch's timing depends on how long the server takes to run it, but if the batch is 'all inclusive' there is less time between commands, so the connections don't have as much time to re-establish themselves.

    Since this is a production environment that I can't mimic, it is hard for me to test all of this. But last night, when the job ran at 1 AM, I happened to be awake, so I checked it. It ran successfully with the modifications I made. It gave a few errors about PIDs not being available to be killed, but that may be because they were already gone: my proc stores them in a temp table and iterates through them, so the timing of the KILL statements might have to be reviewed.

    Thanks for the help.
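
    Regarding the kill errors and the earlier worry about the job killing its own pid, here is a rough sketch of a kill loop that reads the sessions at the moment it runs, skips the current session, and tolerates sessions that have already gone away. It assumes SQL Server 2012 or later (for the database_id column in sys.dm_exec_sessions), and N'MyDb' is a placeholder. It is not the original proc.

    ```sql
    DECLARE @db   sysname = N'MyDb';   -- placeholder database name
    DECLARE @spid int;
    DECLARE @sql  nvarchar(50);

    DECLARE kill_cursor CURSOR LOCAL FAST_FORWARD FOR
        SELECT session_id
        FROM sys.dm_exec_sessions
        WHERE database_id = DB_ID(@db)
          AND is_user_process = 1      -- leave system sessions alone
          AND session_id <> @@SPID;    -- never kill the session running this code

    OPEN kill_cursor;
    FETCH NEXT FROM kill_cursor INTO @spid;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        BEGIN TRY
            -- KILL does not accept a variable, so build the statement dynamically.
            SET @sql = N'KILL ' + CAST(@spid AS nvarchar(10)) + N';';
            EXEC sys.sp_executesql @sql;
        END TRY
        BEGIN CATCH
            -- The session may have ended on its own; report it and carry on.
            PRINT N'Could not kill session ' + CAST(@spid AS nvarchar(10))
                  + N': ' + ERROR_MESSAGE();
        END CATCH;

        FETCH NEXT FROM kill_cursor INTO @spid;
    END;

    CLOSE kill_cursor;
    DEALLOCATE kill_cursor;
    ```

    New connections can still arrive between the last KILL and the next statement, so this does not close the timing gap discussed above.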

  • I'm not a fan of killing sessions, but instead of loading them into a temp table and then, presumably, using a cursor to clean them out, what about just using WITH ROLLBACK IMMEDIATE on the ALTER DATABASE ... SET command? You can read more about it in the ALTER DATABASE documentation. It's going to be much, much more efficient. But be warned: it's going to be very efficient. Setting it by accident causes all sorts of problems.

  • OK, since our connections are read-only web connections, killing sessions doesn't hurt anything, IMHO. I do use ROLLBACK IMMEDIATE in my code to alter the DB, but if I can't get exclusive access to it, how can I alter it? That is the crux of the issue.

  • You shouldn't need exclusive access in order to set restricted user with the immediate rollback.

  • Another option for preventing connections to the DB, so you can restore in peace, would be to set it offline rather than to single_user or restricted_user.

    See this topic:

    "What can I do to an offline DB?"

    I've been using this method to refresh QA DBs here at work and it's been working like a champ. Still need to go in and check for orphaned users, but that only takes a couple minutes (and likely could even be scripted out.)

    Jason
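
    A sketch of the offline approach described above, followed by one way the orphaned-user check might be scripted. [MyDb] and the backup path are placeholders, the REPLACE option is an assumption about the refresh scenario, and this is not the actual script from the post.

    ```sql
    USE master;
    GO

    -- Take the database offline, rolling back open transactions immediately.
    ALTER DATABASE [MyDb] SET OFFLINE WITH ROLLBACK IMMEDIATE;

    -- Restore over it. With the default RECOVERY option the database
    -- comes back online when the restore finishes.
    RESTORE DATABASE [MyDb]
        FROM DISK = N'\\server\share\MyDb.bak'
        WITH FILE = 1, REPLACE;
    GO

    USE [MyDb];
    GO

    -- List SQL-authenticated database users with no matching server login.
    -- Users deliberately created WITHOUT LOGIN will also show up here.
    SELECT dp.name AS orphaned_user
    FROM sys.database_principals AS dp
    LEFT JOIN sys.server_principals AS sp
        ON sp.sid = dp.sid
    WHERE dp.type = 'S'           -- SQL users only
      AND dp.principal_id > 4     -- skip dbo, guest, INFORMATION_SCHEMA, sys
      AND sp.sid IS NULL;
    ```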
