Best way to replicate tables from Server A to Server B

  • I've seen some really excellent posts in this forum for people asking questions about the particular type of replication needed, and as everyone basically has different requirements, I thought I would post mine up here too!

    I have two identical servers - Server A and Server B.

    Server A is a live database server and is tied to a web application, it's busy.

    Server B is doing nothing. I would like to copy the 5GB database on Server A to Server B so we can use it for certain READ ONLY reporting lookups, thus taking (some) heat off Server A. Every little helps.

    I need to make sure that Server B is in sync with Server A - every hour!

    I would like to access the 'copy' of the database on Server B at all times (even during the copy period) but I am never writing to Server B, just reading.

    If I add a table or remove a table from the database on Server A, I would like an easy process to update the replication to include / exclude them from the replication.

    Now - on first glance I thought that database mirroring was my man. But then I read somewhere that with Mirroring, the mirror database isn't actually 'usable' as a live database in it's own right - meaning that I could not perform queries against it like a typical database - is this correct?

    Log shipping also does not provide me with a live database (so I have read) and also you can't select / de-select tables, it ships the lot.

    So that leaves me with replication doesn't it?

    In which case, for my scenario - is a one way transactional replication the best solution? Snapshot 5GB every hour is going to be painful.

    Whilst a transactional replication is in process, can users still run reports against Server B?

    There is no failover requirement here, just moving the entire DB to Server B for 24 hour reporting stuff.

    Cheers,

    O.

  • I would set up transactional replication to server B syncing every 5 minutes. That way you are not going to get sudden huge spikes in traffic just a series of small ones throughout the day.

    If Server A is stresed then I would set up the subscription as a PULL subscription so Server B does most of the work.

    After setting up the initial publication/subscription I would check to see if the subscription is @sync_type 'Automatic' and if so drop the subscription (sp_dropsubscription and re-add it as a @sync_type='none'. This tells the server that the schema and objects are already at the subscriber.

    I would also amend the DELETE stored procedures are the subscriber end. The will be called sp_MSDel... You probably want to block deletes as you probably want to retain more data at your subscriber than is currently at your publisher. The company I work for keeps between 3 and 5 days worth of data LIVE and 30 days at the reporting end.

    It is also worth amending the sp_MSIns stored procedure so they do an IF EXISTS() check to see if the primary key is already at the reporting end. If you ever have to resync this prevents a whole lot of woe.

    You need to be careful when you generate the initial snapshot as the end process of a snapshot is to lock the tables at the publisher end while the system checks the integrity of the snapshot.

    Where I work we also make the distinction between configuration tables (lookup tables) and persistence tables (things like orders, enquiries etc).

    We may have two publications. One for configuration data and one for persistence. This gives us some freedom in how we configure replication.

    Its not as complicated as it sounds.

  • Log shipping has a read only option that i use for report servers

    The issue is will kick the users out to restore the database

  • Right I need to be able to use my database copy whilst the replication is in process.

    Am I right in saying that Transactional Replication is the only way to achieve this?

    O

  • I don't know about it being the "only" way but it is certainly what I use.

    I have a reporting server and data is constantly being replicated down to it. After all, at the subscriber end all that is happening is a bunch of simple CRUD stored procedures is writing data into tables.

  • Otis (2/26/2008)


    Right I need to be able to use my database copy whilst the replication is in process.

    Am I right in saying that Transactional Replication is the only way to achieve this?

    O

    database mirroring with DB SNAPSHOTS is another way.

    Personally, I preffer replication.


    * Noel

  • Thanks Noel,

    I hadn't thought of that - however, the problem is that - from what I understand in this thread, that snapshots require a lock on the source database at the end of the snapshot to verify. As the brief is to be in sync at least to the hour, and it's a 5GB DB - a lock every hour on a live web application, isn't really going to be much fun.

    So I think the answer is transactional replication, but rather than do it every hour, do it in small drips every 5 mins - and have all this running from the Pull server to save CPU on the source.

    But again thanks for your reply every post on here has been useful...

    O

  • OK now I'm looking at this as a two server system, the publisher is very busy, the subscriber is doing nothing.

    I have seen in various posts that generally say it's recommended to have Publisher, Distributor, Subscriber on different servers. Some articles state that due to resource constraints you can stick your distributor on your publisher, and then PULL from your subscriber.

    But there doesn't seem to be anything to suggest that you can stick your distributor on your subscriber.

    Is this possible, based on a two server solution, and that the subscriber is the same spec as the publisher - just doing nothing, surely putting the distributor on the subscriber is the best bet?

    But is it possible? Anyone doing this?

    O

  • "But there doesn't seem to be anything to suggest that you can stick your distributor on your subscriber."

    You are completed right. You can do that, the distributor takes the data from the transaction log and put it inside the distribution database. The only problem with the 2 server architecture is that if the Distributor and the Publisher are not in the same server you have the risk of the distributor going offline and the transaction log filling the hard drive in the publisher since the log won't be truncate if transactional replication is setup. But anyway Microsoft recommends 3 servers and 1 for the distributor, so the same can happen.

    Anyway I recommend transactional replication with push subscription, and put the distributor and suscriber in the same server. (when dist and susc are in the same server push or pull is the same)

    good luck.

    Pedro R. Lopez
    http://madurosfritos.blogspot.com/[/url]

  • Thanks I have one last little question, which probably sound a bit dumb - but I'm new to this...

    In a transactional replication setup and say, your publisher needs to be restarted, or goes offline for whatever reason.

    When this server comes back online - does it just merrily carry on as it was before, publishing to the distributor, without any user intervention - or do you have to drop the subscriber database, publish a new snapshot, pull/push snapshot from/to subscriber .... basically set everything up again. Each time you restart your pubisher?

    Of course the hope is this server is not going to be taken down at all - but for service packs etc. I guess it will happen.

    Clearly you can see I have not tested this, but before I do - what should I except?

    I guess the 'carry back off where it left off' situation is too much to ask right!?!

    O

  • Restarting the publisher is not a problem.

    Restarting the distributor or subscriber may have an affect depending on how long they are down for.

    If they are down for longer than you have configured the expiry time for your publication then you can either reinitialize your publication or (as I do) drop and recreate the subscription then manually resync the missing data. On large databases this is much quicker than reinitializing everything.

  • Guys, I am setting up a DR database and have a question about the distribution database. In our scenario there are already replication publications on the existing server. I would like to offload processing from this main server for the new replication pieces that I will be putting into place and saw earlier in the thread that someone recommended to have server "B" in this instance actually pull the data to reduce the workload on server "A". Does that mean however that the distribution database is still located on server "A"? If so, is that a big deal and will that distribution database cause issues with performance or will the fact that setting up a pull subscription be enough to take away the workload of the replication?

    Thanks for the great topic and assistance!

  • Hi all,

    Thanks for the help on this replication thread, I have a test system running and all seems well. This two server setup - I needed to take as much heat off the Publisher, so this meant putting the distributor and subscriber on the same (idle) server. Haven't seem any comments which suggest this is a bad thing.

    My next worry is 'managing' the replication after it's started - mainly watching out for 'gotchas' like changing certain parts of the schema, SPs, PK columns etc. It's hard to find SQL 2005 specific details on these 'rules' and as SQL 2005 had major changes in what you are allowed to change in the schema (allot more) it would be very handy if someone knew of a good article which list out the can and can't dos.

    The article on this site 'how to break replication' was very informative - more of that stuff!

    Any recommendations?

    Otis.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply