Replication - Snapshot Agent won't start via SQL

  • Hello,

    I am trying to set up replication using a least-privilege model and I'm running into some behavior I can't quite figure out.

    Basic information:

    - SQL Server 2008 (10.0.5500 on distributor/publisher, 10.0.4279 on subscriber)

    - Transactional replication (push)

    - Distributor and publisher on same instance

    - Remote subscriber

    - Separate domain accounts for snapshot, log reader, distributor agents

    If I set up "vanilla" replication using the SQL Agent service and let the replication agents impersonate that account, everything works fine. The problem begins when I configure replication to use the separate, low-priv AD accounts. The snapshot agent fails to run - it starts, but fails with no error message. Both the replication monitor and the snapshot job history say "The replication agent encountered a failure. See the previous job step history..." (This is on step 2 of the job, "Run agent"). The Windows log on the host doesn't show anything that appears to correlate with the failed jobs.

    I have double-checked the permissions for the snapshot agent. It's a member of db_owner in distributor and publisher, it has full control on the folder where the snapshots are stored, and it's a member of the PAL (and so it is mapped to a user in the subscriber db as well). It is also a member of SQLAgentUserRole in msdb (this may not be necessary). The snapshot job is owned by me (sysadmin), and Step 2 is run using a proxy with a credential for the correct domain account. The domain account is not locked out. It has the following permissions on the host OS:

    - Act as part of the operating system

    - Adjust memory quotas for a process

    - Bypass traverse checking

    - Create global objects

    - Impersonate a client after authentication

    - Lock pages in memory

    - Log on as a batch job

    - Log on as a service

    - Replace a process level token

    Other things I've noted:

    - If I add the domain account to the Local Administrators group, I can execute the snapshot job from SSMS

    - If I remove the domain account from Local Administrators, log in to the host as that domain account, and run the snapshot agent from the command line using the flags in the Job Step, it works, and Replication Monitor shows everything working properly

    - Having confirmed that the snapshot agent domain account seems to have the OS permissions it needs, since it runs locally, I go back to SSMS and try to execute the snapshot job, which fails. I can't add the -outputverboselevel flag to the job step.

    This suggests, to me, that the domain account has the rights it needs to run the snapshot agent, and that the breakdown is in the process that communicates between the SQL Agent and the snapshot executable. Is my understanding correct? This is getting into proxies and subsystems, which I'm a little new to. I spent all day yesterday scouring forums to get this far. The answers mostly amount to "check the NTFS permissions on the snapshot folder for the domain account", plus a few along the lines of "proxy accounts need to be configured with the same OS permissions as the agent service account, which MSDN doesn't do a good job of documenting". Hence the laundry list of Local Security Policy (LSP) rights above.
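
    For anyone checking the same thing, the proxy-to-subsystem mapping can be inspected with a query along these lines. This is a minimal sketch, assuming the standard msdb catalog tables (sysproxies, sysproxysubsystem, syssubsystems) and permission to read sys.credentials. It only confirms that a proxy is enabled, granted to the Snapshot subsystem, and mapped to the expected domain account; it says nothing about whether that account's OS rights are sufficient.

    ```sql
    -- Sketch: list SQL Agent proxies granted to the Snapshot subsystem,
    -- together with the Windows identity stored in each proxy's credential.
    SELECT p.name                AS proxy_name,
           p.enabled             AS proxy_enabled,
           c.credential_identity AS windows_account,
           s.subsystem           AS subsystem_name
    FROM msdb.dbo.sysproxies        AS p
    JOIN msdb.dbo.sysproxysubsystem AS ps ON ps.proxy_id     = p.proxy_id
    JOIN msdb.dbo.syssubsystems     AS s  ON s.subsystem_id  = ps.subsystem_id
    JOIN sys.credentials            AS c  ON c.credential_id = p.credential_id
    WHERE s.subsystem = N'Snapshot';
    ```

    If the proxy used by Step 2 does not appear in the result, or shows as disabled, that would point at the proxy configuration rather than at the account's permissions.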

    Thanks

  • I think I made a mistake when adding the flags to the job step: I had a 2>&1 operator that it didn't like. I'm trying it again with -outputverboselevel and -output...
