February 24, 2009 at 3:45 am
Hello,
I have set up a test log shipping solution to learn how to implement and administer it before implementing one in our live production environment. It is a conventional and simple implementation involving three servers, the primary, secondary and a monitor server all of which are separate servers (no server is doing more than one role).
So far I am happy with my understanding of the solution and have learnt about it as i have progressed through the implementation. I've been running through various test scenarios (primary outage, secondary outage etc) to ensure that I have a good understanding of the various components, what issue could arise and how they manifest themselves.
It is at this point I have hit a peculiarity. When I stop the sql agent on the secondary server to replicate an outage, I am expecting a 14421 error to appear in the sql log on the monitor server once the restore threshold has been exceeded. From what I understand the message should advise along the lines of the secondary database on the secondary server has not been restored for xx period. What I am in fact getting is a 14420 error message (see below), which references the secondary sever by name but has the message text that I would expect from the primary server if the backup threshold is exceeded.
The log shipping primary database server2.test_database_1 has backup threshold of 30 minutes and has not performed a backup log operation for 32 minutes. Check agent log and log shipping monitor information.
situation:
- The primary server has not been touched and is still doing the backups fine (i get the correct message for outage on this server)
- The secondary server and database are not set up as a primary for any other copies (i checked, just to be sure)
- the sys.logshipping tables all appear to have the correct references in them for the primary and secondary servers
The only unconventional setup in this topology is that the secondary server is a virtual server (VMWare), but given my experience with them so far I would be surprised if that is causing the problem.
I have trawled the internet and search forums trying to find any record of anyone else reporting the same issue with out any success.... any help would be appreciated.
Thanks
Dave
February 24, 2009 at 4:14 am
What's frequency of log backup?
---------------------------------------------------
"Thare are only 10 types of people in the world:
Those who understand binary, and those who don't."
February 24, 2009 at 4:15 am
I have an update, all of the following is on the monitor server:
- I found that the alerting SP is [sys].[sp_check_log_shipping_monitor_alert]
- then I broke down the SP to find that the error being called for the secondary was dependant on the value in msdb.dbo.log_shipping_monitor_secondary, and for the error called attribute threshold_alert
- the value in there for the threshold_alert was 14420, which was weird as i think that should be 14421
- i ran the following to update the value:- update msdb.dbo.log_shipping_monitor_secondary set threshold_alert = 14421
The correct error message is now firing from the monitor when relating to an exceeded restore threshold on the secondary server 🙂
Even though the problem is now solved, I would appreciate any feedback of anyone else who had the same issue. Can anyone advise if this is a known bug or did I do something wrong in the initial log shipping setup wizard?
cheers
dc
February 24, 2009 at 4:28 am
I guess in your case backup threshold value is greater than log backup intervel.
---------------------------------------------------
"Thare are only 10 types of people in the world:
Those who understand binary, and those who don't."
February 24, 2009 at 4:39 am
?
Hi, I think you may be confused as to what I'm trying to articulate :ermm:. The problem wasn't that i was receiving alerts (they are firing at the specified thresholds) the problem was that the alert for the secondary was firing the wrong error (14420, instead of 14421).
I have found that there appears to be a bug in the wizard because if amend the alert threshold the value for the threshold_alert resets to 14420 (it should be 14221). I have run a profiler on the wizard and am currently looking though what it does to confirm my suspicions.
February 24, 2009 at 5:58 am
DC, you bang the target profiler will tell you everything 😉
---------------------------------------------------
"Thare are only 10 types of people in the world:
Those who understand binary, and those who don't."
Viewing 6 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply