Detecting Changes to a Table

  • HansB,

    BTW I was thinking more about your post, and it also occurs to me that if you have a datetime column maintained by triggers, then you could simply use the query: [SELECT MAX(<datetimecol>) FROM Table] as your algorithm for detecting changes to the table, rather than bothering to use a CHECKSUM based approach.

    Best regards,

    Lawrence.

  • In 2008 and 2008 R2 (might only be enterprise), there is the change data capture configuration that could also be put in place.

    http://msdn.microsoft.com/en-us/library/bb522489.aspx

    Regards,

    Steve

  • Lawrence Moore (5/23/2011)


    Hi Eric,

    Many thanks for your post.

    It is true that DMVs offer lots of useful information, some of which could be applied for requirements discussed in my article.

    However, DMVs typically require elevated user permissions, such as VIEW SERVER STATE.

    Regards,

    Lawrence

    I'd think that a process, which does something like querying for changes in a table and selecting the rows out to another table, would not be running under an application or user account. It would probably the SSIS or the SQL Agent account, in which case it would be OK to grant it VIEW SERVER STATE permission.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Lawrence/Eric:

    You guys are right. Thanks for pointing that out to me.

  • Although you may use better methods in SQL Server 2008 now, there are many companies which still run MSSQL 2000/2005 and I appreciate the time you took to write this for us.

    Thank you.

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • Lawrence.

    It always goes back to the same question, what do we need to accomplish? In some cases CHECKSUM might be a valid option, (I am having a hard time coming up with an example in its favor, but that has more to do with the index scan than the symmetric update issue.) but there might be scenario's where this is a valid solution, as triggers might do the trick for (probably most) other people.

    I was merely pointing out that the standard audit columns would alleviate the symmetric update problem, I for one still would use triggers.

    Again, I do appreciate you taking the time you took, and sharing your stuff with us!

    Hans

  • I think if we add one more column to CheckSumTest Table - LstChgDate DateTime, and we update it every operation, insert, update etc.

    The CheckSum_Agg() will also return diff value when symmetric update.:-P

  • 1) Index usage DMV's are emptied when SQL Server is restarted http://msdn.microsoft.com/en-us/library/ms188755.aspx

    2) When properly, precisely done checksums (all columns fixed width or very, very careful delimiting (CHAR(31), perhaps), COALESCE/ISNULLed to an impossible value, etc.) can tell you:

    2a) If two rows are different

    2b) If you need to check further into whether two rows are the same or not; regardless of the theoretical collision chances based on statistical assumptions, I've seen many reports of actual collisions, particularly since real (unencrypted) data very rarely looks random; you can get "Bobs" "cars" and "Bob" "scars" easily enough (see prior comments about properly and precisely done).

    3) Lastupdate type columns yields false "the data changed" results if a table was rebuild even when all data except the lastupdate column is 100% identical.

    4) For full comparisons, I prefer something like:

    WHERE (

    ((Old.[field1] <> New.[field1]) OR ((Old.[field1] IS NULL) AND (New.[field1] IS NOT NULL)) OR ((Old.[field1] IS NOT NULL) AND (New.[field1] IS NULL)))

    OR ((Old.[field2] <> New.[field2]) OR ((Old.[field2] IS NULL) AND (New.[field2] IS NOT NULL)) OR ((Old.[field2] IS NOT NULL) AND (New.[field2] IS NULL)))

    )

    Note that for tables with many columns, having an automated way to build these statements is much better than copy/paste/modify.

    For anyone that insists on using checksums, I would recommend a scheduled process (weekly, monthly, quarterly) that does a thorough check for collisions. Gamblers hope for good look... we should check regularly for bad luck.

  • If you understand what a checksum is then you know whether it is appropriate to use it. It is a type of hash, so sometimes it is appropriate and sometimes it isn't.

    IMHO the business rules will dictate the usefulness rather than the collision probability.

    I think it would be appropriate for BOL to point that out and either provide a reference to additional information or (2) explain how the aggregate is computed.

    Good article though, created a great discussion, I'm giving you a 5 😎

  • If we add a timestamp/rowversion column will the CHECKSUM_AGG() not always give accurate results?

  • I'm looking for the most eficcient way to audit the changes on a particular table. Generaly you need to know which table, row and column where changed. I mean the option is triggers but expend very much time to create one by one on each table. Also we need to check inserted against deleted temp row column by column...

    Does exist another way?

  • fmendes (5/24/2011)


    If we add a timestamp/rowversion column will the CHECKSUM_AGG() not always give accurate results?

    Yes I believe you are right, but this raises the question of what value the CHECKSUM_AGG() is bringing to the party, when you can simply use [SELECT MAX(rowversion) from TABLE] as your "has the table changed algorithm", which could be more efficient with an appropriate index.

    Best regards,

    Lawrence

  • arsinfor (5/24/2011)


    I'm looking for the most eficcient way to audit the changes on a particular table. Generaly you need to know which table, row and column where changed. I mean the option is triggers but expend very much time to create one by one on each table. Also we need to check inserted against deleted temp row column by column...

    Does exist another way?

    Hi arsinfor,

    If you are using SQL 2008, then I believe the new change tracking features could be useful given your requirements.

    http://msdn.microsoft.com/en-us/library/cc280462.aspx

    However, once you start to talk about custom logic based around inserted/deleted rows, then I think you are in trigger territory.

    Hope this helps,

    Lawrence

  • fmendes (5/24/2011)


    If we add a timestamp/rowversion column will the CHECKSUM_AGG() not always give accurate results?

    Nope, not "always".

    Yes, it will (almost always) fix the example given for 'symmetric change' (I think he called it)

    But, it is a hash - so there are collisions.

    A hash on a table that could contain millions of rows and a couple hundred columns is condensed (hashed) down to a single scalar value which is the hash value. So, logically there are some hash values that have more than one source.

    (There are hashes which are 'perfect hashes' but those are restricted to a specific set of data and the source data must be known in advance. Not possible in this discussion)

    HTH,

    -Chris C.

  • Thanks very much Lawrence.

    I am using sql 2005 and I find in this address a possible solution.....

    http://weblogs.asp.net/jgalloway/archive/2008/01/27/adding-simple-trigger-based-auditing-to-your-sql-server-database.aspx

Viewing 15 posts - 16 through 30 (of 30 total)

You must be logged in to reply to this topic. Login to reply