Enabling CDC for hundreds of databases on one instance, do or do not?

  • Hi all,

    I've been given an interesting topic to think about recently in the company I work for. Namely, we are in need of changing our "auditing" approach, meaning replacing many "old-style" very bad performing triggers to track changes made to several columns from several tables.

    CDC is our obvious first option, but the challenge we face is that we would need to enable it for about 500 hundred databases on a SQL 2008 SP1 Cluster. Databases are not too big, averaging about 2 GB. With two agent jobs created PER database, how would 1000 agent jobs affect the server? Any performance factors I need to take into account as well?

    Any input or previous similar experience would be greatly appreciated.

    Thx

  • CDC should perform far better than hand-written triggers on all those tables.

    Make sure you have sufficient pre-allocated log space to hold the changes until CDC can process them all.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Thanks for your reply.

    Yes I'm aware it should perform better, but I'm still somewhat concerned of having to have 1000+ agent jobs on the SQL Instance. Any way of consolidating those into one or two?

    Thx

  • ScottPletcher (5/12/2014)


    CDC should perform far better than hand-written triggers on all those tables.

    Make sure you have sufficient pre-allocated log space to hold the changes until CDC can process them all.

    Rumor has it that CDC doesn't identify the user that made the change in a lot of cases (rumor says all cases but I hate infinite enumerations :-D).

    Besides, I wouldn't ever handwrite a trigger... I use a stored procedure to do that kind of things for me.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • tomes12 (5/12/2014)


    Hi all,

    I've been given an interesting topic to think about recently in the company I work for. Namely, we are in need of changing our "auditing" approach, meaning replacing many "old-style" very bad performing triggers to track changes made to several columns from several tables.

    CDC is our obvious first option, but the challenge we face is that we would need to enable it for about 500 hundred databases on a SQL 2008 SP1 Cluster. Databases are not too big, averaging about 2 GB. With two agent jobs created PER database, how would 1000 agent jobs affect the server? Any performance factors I need to take into account as well?

    Any input or previous similar experience would be greatly appreciated.

    Thx

    I wouldn't be too concerned about the agent jobs. I'd be concerned about the amount of memory each logreader would consume, plus each of those would be single threaded and take away valuable resources from SQL Server. As much as it concerns me, I'd look at optimising the triggers or reviewing the requirements.

  • You have broad control of the frequency of cdc scan processing, and can even customize it if you want. I wouldn't ever want to risk the never-ending maintenance of custom triggers across that many tables.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (5/13/2014)


    You have broad control of the frequency of cdc scan processing, and can even customize it if you want. I wouldn't ever want to risk the never-ending maintenance of custom triggers across that many tables.

    That's true but it will impact the transaction log use and growth. You can create a custom code to generate the triggers. I don't think there's an ideal solution with the information given

  • MysteryJimbo (5/13/2014)


    ScottPletcher (5/13/2014)


    You have broad control of the frequency of cdc scan processing, and can even customize it if you want. I wouldn't ever want to risk the never-ending maintenance of custom triggers across that many tables.

    That's true but it will impact the transaction log use and growth. You can create a custom code to generate the triggers. I don't think there's an ideal solution with the information given

    Exactly, which is why I stated this earlier:

    Make sure you have sufficient pre-allocated log space to hold the changes until CDC can process them all.

    I assumed some degree of extra control would be necessary with hundreds of databases and tables involved.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Hi,

    thanks for the replies. As for log use and growth, I'd be fine with it. I can assign enough space and dedicated luns for them. The comment about having 1000 agent jobs and the memory each logreader would consume is what concerns me. The server is not under memory pressure nor CPU pressure at all. CPU is averaging about 10-15%

    I was reading through MSDN (http://msdn.microsoft.com/en-us/library/cc645591.aspx) and confirmed my previous thoughts of somehow consolidating all the agent jobs into one. This line suggests that customizations are doable: "By replacing the delivered job step calling sp_cdc_scan with a call to a user written wrapper for sp_cdc_scan, highly customized behavior can be obtained with little additional effort"

    So, does it sound too crazy to create a job that will wrap the call to sp_cdc_scan in some kind of a loop through all the databases? (obviously calling it with the proper parameters and making the necessary adjustments in msdb.dbo.cdc_jobs if at all needed)

    Would that somehow change the memory consumption (doubt it) of each logreader or would it only make it easier for me from a Agent Job management aspect?

    I've googled this so much now, it's really strange to not find any case of someone enabling CDC on more than one database, or hundreds of them for that matter.

    Thx,

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply