Change Data Capture and indefinite retention

  • I have an application group who wants to use CDC as a long term solution for tracking changes. They dont want to move historical data to archive tables or anything like that- they want to just enable CDC, with no cleanup. To quote a wise smuggler, "I have a baaaad feeling about this... "

    Admittedly, they intend to implement this on a database without a great deal of activity presently, as a trial, so it's hard for me to explain exactly why I don't have the warm and fuzzy for this. Basically, I'm thinking back to a situation where I inherited a database in merge replication with indefinite retention which had not be reiniailized in 5 years. The metadata tables were enormous.

    Is there any documentation/white paper I can point to that discussed the pitfalls/dangers associated with using CDC with indefinite retention? Can anyone point to an impact this would have on indexing? How about any impact on backup/recovery?

    Thanks in advance,

    Warren

  • :hehe:

    Dear ,i found good link 4 that issue

    http://64.4.10.145/en-us/magazine/cc987538

    http://www.mssqltips.com/sqlservertip/1755/process-change-data-capture-in-sql-server-integration-services/

  • johnitech.itech (1/3/2012)


    :hehe:

    Dear ,i found good link 4 that issue

    http://64.4.10.145/en-us/magazine/cc987538

    http://www.mssqltips.com/sqlservertip/1755/process-change-data-capture-in-sql-server-integration-services/

    Hmm, this doesnt really address the issue I was asking about- indefinite retention and the pitfalls of such. I have a blackbelt in googlejitsu, so I've read the obvious. I'm looking for some firsthand knowledge or any whitepapers specifically addressing using CDC for longterm storage of change data.

  • Hi,

    Like you NJ-DBA, this doesn't 'feel right' to me either, mostly because of the way that CDC operates in the background.

    I'll look for something with more meat on it, but even in the TechNet discussion of CDC

    http://technet.microsoft.com/en-us/library/cc645937.aspx"> http://technet.microsoft.com/en-us/library/cc645937.aspx

    they state that there is an administration concern with CDC and that it must be pruned periodically.

    "Data that is deposited in change tables will grow unmanageably if you do not periodically and systematically prune the data."

    I believe that I may have a document that's more in line with what you're asking for. I'll track it down and post the reference.

    Steve

  • Thanks Steve- much appreciated

  • We recently had serious issues develop with just 90 days retention on one table in a busy database. The change table was 333 GB for a 31 GB source table.

  • blackbelt in googlejitsu

    lol

    [font="Courier New"]Looking for a Deadlock Victim Support Group..[/font]
  • NJ-DBA (1/3/2012)


    I have an application group who wants to use CDC as a long term solution for tracking changes. They dont want to move historical data to archive tables or anything like that- they want to just enable CDC, with no cleanup. To quote a wise smuggler, "I have a baaaad feeling about this... "

    Admittedly, they intend to implement this on a database without a great deal of activity presently, as a trial, so it's hard for me to explain exactly why I don't have the warm and fuzzy for this. Basically, I'm thinking back to a situation where I inherited a database in merge replication with indefinite retention which had not be reiniailized in 5 years. The metadata tables were enormous.

    Is there any documentation/white paper I can point to that discussed the pitfalls/dangers associated with using CDC with indefinite retention? Can anyone point to an impact this would have on indexing? How about any impact on backup/recovery?

    Thanks in advance,

    Warren

    I think that the documentation makes it fairly clear that this is not something the CDC is intended for, especially the default 3 day cleanup period.

    I would turn the discussion around, and ask them to explain why this will work and to supply examples of someone who has implemented that idea successfully.

  • NJ-DBA (1/3/2012)


    I have an application group who wants to use CDC as a long term solution for tracking changes. They dont want to move historical data to archive tables or anything like that... "

    Is that application group aware that CDC creates "Change Tables" that are by definition one of the "anything like that" things that they want to avoid?

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • Chuck Hottle (1/4/2012)


    We recently had serious issues develop with just 90 days retention on one table in a busy database. The change table was 333 GB for a 31 GB source table.

    There you go NJ-DBA.

    I forgot to mention the three day default cleanup (thank you Michael), which is also a strong clue that the change tables will grow like wildfires.

    Additionally, the native functions to retrieve changed data

    e.g.-

    cdc.fn_cdc_get_all_changes_capture_instance ( from_lsn , to_lsn , '<row_filter_option>' )

    are tied to LSN's ... I'm not clear that they would be well suited for general reporting in the way that your app team would need it if ever required.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply