High-Availability Strategy to avoid rollback issues

  • Hi Folks,

    I dug around on this forum and searched and couldn't find exactly what I was wondering, which makes sense because it doesn't seem out there on the web as well.

    I'm the sole DBA at my company now, and I'm comparing different High-Availability solutions for SQL 2008, which we plan to move to sometime next year (we're on a mix of 2k and 2k5 right now). Most of our main production servers are still SQL 2000, against my will I might add!

    In the past we've had issues with a long-running stored procedure that calls a cross-server query to a remote table, runs a select against the values (and CASTS some of them), and essentially re-populates a table on our target db overnight, outside of prod hours. Classic ETL stuff. Just recently this SP has started to run a long time on us, like an hour or more outside its time span. None of the others do it, and the DB is in good shape as far as reindexing and updating stats regularly, and other general maintenance stuff. This is on a VM.

    What's nasty is that killing the transaction results in a rollback, and either letting it run or rolling back keeps what appears to be a table lock on a critical customer-facing table. This causes issues in one of our web apps.

    In looking at High-Availability for SQL 2008 versus 2000, I'm trying to find a solution that is more tolerant or perhaps lets us avoid these kinds of rollbacks in the first place. I'm probably daydreaming as I imagine that since this is internal to the engine a rollback in a clustered environment is still just as problematic as one in a single-server environment, but I'm bouncing the question here in case someone can think of something I missed with clustering, or can think of a high-availability solution that might help us avoid the fun that can happen when we initiate that ever-fun "kill [spid]" command.

    The objective is an environment more fault-tolerant that allows us to do maintenance as needed with little to no impact on the customer experience, and has better tolerance for say, a hanging stored procedure.

    Please don't make comments about how we need to troubleshoot our code; I assure you we have and sp's don't get simpler than this. I'm more interested in the bigger picture going forward.

    Much appreciated, everyone.

  • Anybody? 🙂

  • So have you managed to get to the bottom of what is making it run long?

    Would the proc actually finish if you just simply let it process long enough?

    Are you running this over linked servers?

    I can't help but think this might do better using something like SSIS (which you wouldn't have had access to in 2000), or - dump it out and pull it back in. Trying to do this directly with a remote server often causes issues once you get past a certain size.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply