Reinitialization of Subscription took 8 hours to recognize the snapshot was there?

  • I have Publisher A, with Database A on Server A. I have Subscriber Database X on Server X; the Distributor is also on Server X.

    Publisher A has 15 publications (named 1 through 15 for this topic) for Database A, all of which are subscribed to by Database X on Server X, using the Distributor on Server X (pull subscription).

    Server A and Server X are at two different physical locations.

    ReplData share is on Server X using UNC as normal.

    Last week, I reinitialized and re-ran the snapshot for Publication 13. The Snapshot Agent ran in less than 2 minutes. I verified that the snapshot files existed in the ReplData share. In Replication Monitor, the Distributor to Subscriber tab showed the ever-exasperating but typical statement: "initial snapshot for publication is not yet available." Within about 5 to 10 minutes, the Distributor to Subscriber tab said "Applying Script Pub 13.pre", and everything went swell until the subscription was reinitialized and working great. The entire process took about 20 minutes. No problems. I went home for supper.

    This week, I did the exact same thing; however, even though the snapshot was created and was in ReplData, the "initial snapshot for publication is not yet available" statement sat in the Distributor to Subscriber tab for 8 hours before the "Apply script Pub 13.pre" statement was actually executed.

    Now, 1 hour into the 8 hours, I'm thinking there's a problem, so I'm sp_who-ing Server X, and I'm not finding anything that pops out (e.g. blocked replication spids). Since it's QA, I finally called it a night and went home. This morning I discovered that, lo and behold, some 8 hours after the snapshot was created successfully, replication "discovered" the snapshot and reinitialized the subscription.

    Can anyone please explain the underlying architecture: what is SQL Server actually doing right after the snapshot is finished? The only difference between last week and this week that I can readily point to is that several of the other publications were running about 2 hours behind in latency. Does the whole distribution process have to "catch up" on all the other publications before reinitializing the subscription for Pub 13? Or is something else at play? Or is there something I can query to figure out what the (*&)(*& is going on, so that it's not such a big black box?

    Thanks,

    Lezza

  • Wow. No gurus wanted to tackle this one, eh? That's why our motto here is:

    Replication...

    ...just say no.

    Lezza

  • This is the first time I have seen an issue like this. Are the servers being replicated connected over a LAN or a WAN? A 2-hour latency is very, very bad. That means you have a big issue. Have you seen any network hiccups?

    The only way the snapshot does not start immediately is if you have not reinitialized the snapshot. Whenever I re-snapshot an article, I also manually restart the Snapshot Agent for that publication and check that the Distribution Agent is running smoothly.

    -Roy
