Log Reader agent fails

  • SQL 2K sp 3. first off i'm new to using replicatin so please bear with me.

    I have a replication setup on 3 servers pub dist, and subs. on my dist under rep mon under log reader I get a failure "The step did not generate any output. The step failed."

    This does not happen right away, I'll get several 5-7 transactions delivered and then the failure occurs. I can then right click on the agent and restart it and all is well for the next few transactions.

    If this problem has been beaten up before please forward me to that topic.

    thanks all

    John Zacharkan


    John Zacharkan

  • Might have found the answer, would appreciate any feedback though. On the Repl-LogReader job agent step 2 run agent under advance retry attempts I change this to 10 to mirror production, this seems to have elevated the problem of the agent just failing out altogether.

    I should also mention that this problem does not occur when both publisher and distributor are on the same server.

    John Zacharkan


    John Zacharkan

  • Dang it failed again, just pull the carpet out from under my feet!

    John Zacharkan


    John Zacharkan

  • Need to turn on logging. Add -outputverboselevel 2 - output c:\replinfo.txt to the job step for the log reader and try again. Then you can see why.

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • Hi Andy

    Here's a copy of the replinfo.txt when the job fails.

     
    
    Microsoft SQL Server Log Reader Agent 8.00.760
    Copyright (c) 2000 Microsoft Corporation
    Microsoft SQL Server Replication Agent: YAPSTGVIRPUB01-APPORTAL-8

    Connecting to Publisher 'MySTGVIRPUB01.APPORTAL'

    Server:
    DBMS: Microsoft SQL Server
    Version: 08.00.0760
    user name: dbo
    API conformance: 2
    SQL conformance: 1
    transaction capable: 2
    read only: N
    identifier quote char: "
    non_nullable_columns: 1
    owner usage: 31
    max table name len: 128
    max column name len: 128
    need long data len: Y
    max columns in table: 1024
    max columns in index: 16
    max char literal len: 524288
    max statement len: 524288
    max row size: 524288

    [7/3/2003 9:22:02 AM]MySTGVIRPUB01.APPORTAL: select @@SERVERNAME
    [7/3/2003 9:22:02 AM]MySTGVIRPUB01.APPORTAL: sp_MSgetversion
    Status: 4096, code: 20024, text: 'Initializing'.
    The agent is running. Use Replication Monitor to view the details of this agent session.
    Publisher: {call sp_repldone ( 0x000ac8dc000005a6002b, 0x000ac8dc000005a6002b, 0, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 20051, text: 'Delivering replicated transactions'.
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc000010b70024, 0, 812)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 134 commands, 133 xacts. Last xact: 0x000ac8dc000010b70001, '0x000ac8dc000010b70024'.'.
    Publisher: {call sp_replcmds (500, 0)}


    ********************************** STATISTICS *********************************
    Execution time (ms): 23109
    Work time (ms): 22578
    Distribute Repl Cmds Time(ms): 828
    Fetch time(ms): 735
    Repldone time(ms): 32
    Write time(ms): 592
    Num Trans: 133 Num Trans/Sec: 5.890690
    Num Cmds: 134 Num Cmds/Sec: 5.934981
    *******************************************************************************

    Disconnecting from Publisher 'MySTGVIRPUB01'

    John Zacharkan


    John Zacharkan

  • Andy here's a copy of it when it works, albiet the pub and the distr are now on the same server

     
    

    Microsoft SQL Server Log Reader Agent 8.00.760
    Copyright (c) 2000 Microsoft Corporation
    Microsoft SQL Server Replication Agent: MySTGVIRPUB01-APPORTAL-8

    Connecting to Publisher 'MySTGVIRPUB01.APPORTAL'

    Server:
    DBMS: Microsoft SQL Server
    Version: 08.00.0760
    user name: dbo
    API conformance: 2
    SQL conformance: 1
    transaction capable: 2
    read only: N
    identifier quote char: "
    non_nullable_columns: 1
    owner usage: 31
    max table name len: 128
    max column name len: 128
    need long data len: Y
    max columns in table: 1024
    max columns in index: 16
    max char literal len: 524288
    max statement len: 524288
    max row size: 524288

    [7/3/2003 9:37:08 AM]MySTGVIRPUB01.APPORTAL: select @@SERVERNAME
    [7/3/2003 9:37:08 AM]MySTGVIRPUB01.APPORTAL: sp_MSgetversion
    Status: 4096, code: 20024, text: 'Initializing'.
    The agent is running. Use Replication Monitor to view the details of this agent session.
    Publisher: {call sp_repldone ( 0x000ac8dc000010b70024, 0x000ac8dc000010b70024, 0, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 20051, text: 'Delivering replicated transactions'.
    Status: 4, code: 20051, text: 'Delivering replicated transactions'.
    Status: 4, code: 20051, text: 'Delivering replicated transactions'.
    Status: 4, code: 20051, text: 'Delivering replicated transactions'.
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc000034c4003d, 0, 3922)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 424 commands, 418 xacts. Last xact: 0x000ac8dc000034c40001, '0x000ac8dc000034c4003d'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc000035290001, 0, 375)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 1 commands, 1 xacts. Last xact: 0x000ac8dc0000350e0001, '0x000ac8dc000035290001'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc000036e9003a, 0, 203)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 9 commands, 9 xacts. Last xact: 0x000ac8dc000036d40001, '0x000ac8dc000036e9003a'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003a76003a, 0, 359)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 16 commands, 16 xacts. Last xact: 0x000ac8dc00003a760001, '0x000ac8dc00003a76003a'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003aa90001, 0, 31)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 1 commands, 1 xacts. Last xact: 0x000ac8dc00003aa30001, '0x000ac8dc00003aa90001'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003b78001e, 0, 125)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 12 commands, 12 xacts. Last xact: 0x000ac8dc00003b780001, '0x000ac8dc00003b78001e'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003b8c0001, 0, 31)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 1 commands, 1 xacts. Last xact: 0x000ac8dc00003b860001, '0x000ac8dc00003b8c0001'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003bde0017, 0, 1031)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 6 commands, 6 xacts. Last xact: 0x000ac8dc00003bbf0003, '0x000ac8dc00003bde0017'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003c450001, 0, 78)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 7 commands, 7 xacts. Last xact: 0x000ac8dc00003c3e0001, '0x000ac8dc00003c450001'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003c770031, 0, 406)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 3 commands, 3 xacts. Last xact: 0x000ac8dc00003c770001, '0x000ac8dc00003c770031'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003ca00027, 0, 187)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 1 commands, 1 xacts. Last xact: 0x000ac8dc00003c9f0001, '0x000ac8dc00003ca00027'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003eca0029, 0, 250)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 19 commands, 17 xacts. Last xact: 0x000ac8dc00003eca0001, '0x000ac8dc00003eca0029'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003eea0001, 0, 32)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 1 commands, 1 xacts. Last xact: 0x000ac8dc00003ee40001, '0x000ac8dc00003eea0001'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003ef20001, 0, 16)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 1 commands, 1 xacts. Last xact: 0x000ac8dc00003eec0001, '0x000ac8dc00003ef20001'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dc00003f28001e, 0, 157)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 4 commands, 4 xacts. Last xact: 0x000ac8dc00003f280001, '0x000ac8dc00003f28001e'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8dd000000460027, 0, 157)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 11 commands, 11 xacts. Last xact: 0x000ac8dd000000460001, '0x000ac8dd000000460027'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8de000000940001, 0, 203)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 25 commands, 25 xacts. Last xact: 0x000ac8de0000008b0002, '0x000ac8de000000940001'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}
    Status: 4, code: 6002, text: 'Committed batch to distribution database.'.
    Publisher: {call sp_repldone ( NULL, 0x000ac8de000001ce0026, 0, 140)}
    Publisher: exec sp_replcounters N'APPORTAL'
    Status: 4, code: 22021, text: 'Batch committed. Batch consisted of 18 commands, 18 xacts. Last xact: 0x000ac8de000001ce0001, '0x000ac8de000001ce0026'.'.
    Publisher: {call sp_replcmds (500, 0)}
    Publisher: {call sp_replcmds (500, 0)}


    ********************************** STATISTICS *********************************
    Execution time (ms): 346703
    Work time (ms): 10203
    Distribute Repl Cmds Time(ms): 7904
    Fetch time(ms): 5302
    Repldone time(ms): 217
    Write time(ms): 5205
    Num Trans: 552 Num Trans/Sec: 54.101735
    Num Cmds: 560 Num Cmds/Sec: 54.885818
    *******************************************************************************

    Status: 1, code: 22037, text: 'The process was successfully stopped.'.
    The process was successfully stopped.
    Disconnecting from Publisher 'MySTGVIRPUB01'

    John Zacharkan


    John Zacharkan

  • I dont see any errors. Do you have the log reader set to continuous (-continuous), it's the default setting?

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • I agree, that's the dang thing. Yep log reader is set continous.

    Here's the kicker this is an Active/Active cluster seperately it fails it looks like when there is nothing to replicate the job exits out, I don't know why. When I fail over the cluster and both sql servers are on the same server it works. If it never processed the transactions when its in the active/active mode I might understand or at least have something to look at, But it process whatever transactions that are out there and when there's nothing left it quits. I don't get it.

    Any ideas before I call MS? We have to pay here per incident, I'm working on getting premier support.

    quote:


    I dont see any errors. Do you have the log reader set to continuous (-continuous), it's the default setting?

    Andy

    http://www.sqlservercentral.com/columnists/awarren/


    John Zacharkan


    John Zacharkan

  • Here's a lousy workaround I'll send it back to step one when it fails. The only negative (maybe) it'll never die when and if there's a true failure on the publisher.

    This is not a solution but a band-aide.

    John Zacharkan


    John Zacharkan

  • Okay no help from MS yet, let me run this across all you all.

    If I open up cmd.exe and run "c:\mypath\logread.exe" -publisher [mypubsvr] -publisherDB [mypubdb] -Distributor [myDist] -DistributorSecurityMode 1 -Continuous

    The process runs hot and normal.

    When I take the same command and throw it into the SQL Agent replacing the Replication Transaction-Log Reader job step with the one command above, execute it as a CmdExec, the job step runs for a bit, then starts going through the all retry attempts and then it fails.

    The repl agent never went through the retries, but like CmdExec it fails. From the command promt it never fails. Hmmm problem with the com object and SQL server Agent?

    Any thoughts

    John Zacharkan


    John Zacharkan

  • reapplied sp3 on both distributor and publisher, still failing. A new install is a bit of work

    No answer yet from MS.

    John Zacharkan


    John Zacharkan

  • I have no idea but, I'm dying to see the resolution. please continue to post your findings.

  • Ok, everyone ready for the big answer from Microsoft?

    The problem appears to be within the log reader com object and network connectivity. The problem was only reproducible at Microsoft by pulling the network connection between the servers. The problem had occured once before however they was no write up on it. I have requested that the problem be documented along with the solution/workaround.

    Solution, at this time Microsoft has no plan on fixing this problem.

    Workaround - Microsoft agrees with my band aide fix of when step 2 of the repl log reader drops out that I redirect the job step to step 1. Basically keeping the process in an infinite loop. The alternative was to place the job in repetative schedule.

    I am anticipating an official write up from Microsoft which will totally negate everything I said here. As soon as I receive a copy of the write up I'll be certain to post it here.

    John Zacharkan


    John Zacharkan

  • Hopefully, this should be last post on the topic. As promised for what it's worth here's the official written word from Microsoft

    Hi John,

    As agreed yesterday , I am providing you with a brief summary of our telephonic conversation.

    CURRENT ISSUE :

    ===============

    -- You have a replication topology that involves a remote distributor running as a clustered instance of SQL 2K.

    -- you have observed a problem with the log reader agent .

    -- The log reader seems to be performing its tasks successful but at the end of its processing , it reports

    the following message in the agent history .

    “The step did not generate any output. The step failed"

    -- This problem does not occur if both the publisher and the distributor instances are running off the same node of the

    cluster .

    EXPLANATION OF THE PROBLEM :

    ============================

    -- The above error is reproducible only in cases where a temporary network outage is introduced while the logreader is running .

    -- We suspect this to be the root cause of your problem .

    -- Since the implementation of the logreader from within a SQL Agent job involves multiple components , adding additional

    error handling at this level of the product might have side effects on other components .

    -- The error message that is being observed is cosmetic and does not indicate serious problem with the logreader .

    -- Additionally , if the logreader executable is run outside the SQL Agent , it does not fail.

    SUGGESTED SOLUTION/WORKAROUND :

    ==============================

    -- Set the logreader to run at frequent intervals instead of having it run continuously .

    -- Set the flow of the steps within the job such that the next step after a "Run Agent" failure is to go back to the

    first step.

    -- Have the distributor and the publisher run on the same instance of SQL server .

    Hopefuly this will save somebody out there some pain. Thank you to all who lent an ear or a suggestion.

    Cheers

    John Zacharkan


    John Zacharkan

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply