multiple dbs + log shipping + marked transactions

Question

Post reply

multiple dbs + log shipping + marked transactions

DataDog

Hall of Fame

Points: 3305
More actions
July 12, 2011 at 1:03 pm

#243888

hi
I have 10 databases (on 1 server instance) that are cross-dependent (ie: transactions commonly span databases)
all use a home-made log shipping solution that creates tlogs every 5 minutes (and sends them to multiple local and remote locations)
AFAIK There is no way to restore multiple databases without using marked transactions
because stopat=time is not reliable across multiple databases
So I have a job that runs a pointless marked transaction across all 10 databases every 1 minute*
the mark is a utc timestamp (201107121957) rounded to the nearest minute
* or longer - the new timestamp must be at least 1 minute greater than the last
This allows me to restore a consistent set of databases and bring a standby server online
with between zero and 5 minutes of dataloss - but with guaranteed consistency
What I want to know is:
a) how is this problem usually handled?
b) are there any 3rd party products that handle consistent multidatabase backup and restores (I can't find any)
c) is there any way to list the marks in a given tlog ? (I use a rounded timestamp so we can guess if needs be)
d) any reason why this wouldn't extend across multiple server instances (marked transactions with DTC?)

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply

bopeavy SSCertifiable Points: 5869 More actions · Answer 1

What is the purpose of doing this?

:cool:

"There are no problems! Only solutions that have yet to be discovered!"

Jayanth_Kurup SSC-Insane Points: 22967 More actions · Answer 2

I think replication or DB mirroring would be a good fit in the above scenario.

However I think you would suffer significant data loss in your current model because fail over needs to happen ASAP after the actual issue else you run the risk to having to stopat a checkpoint thats too far back.

If this is for a prod environment or you have the budget then clustering would be an option.

Jayanth Kurup[/url]

DataDog Hall of Fame Points: 3305 More actions · Answer 3

bopeavy (7/12/2011)
What is the purpose of doing this?

If you don't do it then you can't restore your production databases to a consistent state.

As it is impossible to backup multiple databases at the same time - you have to use logs.

And without marked transactions it is impossible to restore a *set* of databases to a common "point-in-time" as the "stopat" feature is only consistent within a single database.

AFAIK if you have multiple cross-dependent databases you have to do this or you can never recover from a disaster properly.

DataDog Hall of Fame Points: 3305 More actions · Answer 4

Jayanth_Kurup (7/12/2011)
I think replication or DB mirroring would be a good fit in the above scenario.
However I think you would suffer significant data loss in your current model because fail over needs to happen ASAP after the actual issue else you run the risk to having to stopat a checkpoint thats too far back.
If this is for a prod environment or you have the budget then clustering would be an option.

DB mirroring doesn't support multiple lan and wan targets and has no rewind facility - I can't see the point in it.

Replication is worth a look but I am concerned it is overly complicated and fragile and might not survive version upgrades, whereas my log shipping system has survived from 2000, 2005, 2005 x64 and 2008 R2 without alteration.

Clustering is no use either.

Disaster recovery requires the ability to recover from all kinds of disaster - data corruption, system failure, etc.

None of those solutions is capable of full DR - only partial.

I can't see any risk of "significant data loss" as I can restore the whole system to any 1 minute mark in the last year (from tapes if need be)

and have access to stopat-precision restores for reading data etc.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

Why is stopat = time not reliable across multiple servers if you want to stop at a point in time?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

DataDog Hall of Fame Points: 3305 More actions · Answer 6

GilaMonster (7/13/2011)
Why is stopat = time not reliable across multiple servers if you want to stop at a point in time?

Microsoft explain it better than I can:

http://msdn.microsoft.com/en-us/library/ms187014.aspx

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 7

Link doesn't explain anything.

If all you're doing is putting a mark in the log at a specific time, how is that any different to restoring with stopat that specific time? I can understand if the consistent point of the related databases is not at the same time, transactions start on different DBs at different points and finish at different points and you want to restore to before or after, but that's not what you're doing here.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

DataDog Hall of Fame Points: 3305 More actions · Answer 8

the link states in a big yellow box with the word "important":

"You can recover related databases only to a marked transaction, not to a specific point in time."

as for why:

why should time have any relevance to consistency between log files?

the sql server datetime type is only precise to 3.333 milliseconds

this is true for databases on a single server even more so for databases on multiple servers

stopat is just a convenience for single database restores

stopatmark is required for multi database restores

if you aren't making regular marks - you can not restore correctly

(edit) other reasons why time is unsuitable

- time isn't lockstep even between software threads of the same core, let alone different cores, different cpus and especially not different servers

- consider a typical sql box with multiple cpus, each with multiple cores, each with hyperthreading and then with multiple software threads also ...

what is the chance that timestamps on log writes to two different databases are going to synchronize? (much) less than 100%

so it is unusable

DataDog Hall of Fame Points: 3305 More actions · Answer 9

GilaMonster (7/13/2011)
If all you're doing is putting a mark in the log at a specific time, how is that any different to restoring with stopat that specific time?

the difference is that stopat times are not equivalent across different database logs from the same server

which means you will be getting an inconsistent restore

using stopat in this way can result in a transaction, that originally spanned two databases, being rolled back in one database and committed in the other ...

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 10

If you were marking the transaction that spans the databases I'd agree.

Just having schedules transactions in each individual database every 10 min that mark the log and do nothing else don't make sense. That 10 min schedule is no more or less accurate than the stop at time, the marked transaction can, like with the stop at, cause a transaction across 2 DBs to roll back in one and commit in the other.

Basically I'm not clear why a marked transaction run at exactly 10am is going to give a more consistent restore than a restore stopat 10am when both are prone to the vagaries of time granularities.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jayanth_Kurup SSC-Insane Points: 22967 More actions · Answer 11

Disaster recovery requires the ability to recover from all kinds of disaster - data corruption, system failure, etc.
None of those solutions is capable of full DR - only partial.

There is no one solution fits all . A Good DR plan has Clustering or DB mirroring to recover from OS level and network level failures , RAID and SAN to recovers from disk and data corruption and Log shipping , backups and recovery etc to bring databases online in a DR , All of them used correctly is what makes a good DR plan.

The data loss I was referring to; was a restore to any point in time means that data entered beyond that time is lost.

While it's important for you to bring all databases to a consistent state, what about data in other databases which didn't need to be failed over ?

Besides log shipping is not automatic fail over so you need to still make sure the server is monitored real time to avoid having to restore too far back.

Replication is a tested solution and works well for most cases. The fragility of replication only comes in when the underlying articles are changed too frequently or when logs get choked on the network.

Jayanth Kurup[/url]

Jody.WP Mr or Mrs. 500 Points: 518 More actions · Answer 12

HI ,

hope this helps

http://social.msdn.microsoft.com/forums/en-US/sqldisasterrecovery/thread/53430564-2817-4921-a990-7f84a01fc141/

regards

Jody

DataDog Hall of Fame Points: 3305 More actions · Answer 13

GilaMonster (7/13/2011)
If you were marking the transaction that spans the databases I'd agree.
Just having schedules transactions in each individual database every 10 min that mark the log and do nothing else don't make sense

Maybe I didn't make that clear when I said:

So I have a job that runs a pointless marked transaction across all 10 databases every 1 minute*

note "a ... transaction" (singular) and "across all ... databases" (plural)

I am marking all databases in a single transaction.

DataDog Hall of Fame Points: 3305 More actions · Answer 14

GilaMonster (7/13/2011)
Basically I'm not clear why a marked transaction run at exactly 10am is going to give a more consistent restore than a restore stopat 10am when both are prone to the vagaries of time granularities.

Because "10am" is ambiguous - time is just an estimation - each database/log estimates it differently

whereas an entry in N physical logs really synchronizes those logs - the time of the event is just a convenience to help us manage things

A "point-in-time" for multiple related databases is really a set of LSNs, one per log, which says "all these databases are consistent NOW"

there are two "points-in-time" you can use per mark: stopatmark, stopbeforemark

It isn't about accuracy - it is about consistency

AFAIK there is no way to perform arbitrary high precision point-in-time restores with related databases on SQL server

you have to pick your mark frequency and if there is important data between marks you have to rebuild it manually (for instance by restoring the database in parallel and copying records or by rerunning the application etc)

If that is what you want then you have to merge all databases into one ... then you have 1 log and "stopat" works reasonably well

the limitation is the 3.33 millisecond precision of the datetime type, there could be 100 transactions in 3.33 milliseconds ...