SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Devin Knight

Add to Technorati Favorites Add to Google
Author Bio
Devin is a BI consultant at Pragmatic Works Consulting. Previously, he has tech edited the book Professional Microsoft SQL Server 2008 Integration Services and was an author in the book Knight's 24-Hour Trainer: Microsoft SQL Server 2008 Integration Services. Devin has spoken at past conferences like PASS, Code Camps and several SQL Saturday events. He is a contributing member to the Business Intelligence Special Interest Group (SIG) for PASS as a leader in the SSIS Focus Group. Making his home in Jacksonville, FL, Devin is a participating member of the local users’ group (JSSUG).
 

SSIS – CHECKSUM Transformation

By knight_devin@hotmail.com in Devin Knight | 01-05-2009 8:08 AM | Categories:
Rating: (not yet rated) |  Discuss | 3,013 Reads | 126 Reads in Last 30 Days |7 comment(s)

CHECKSUM is a TSQL function that computes a hash value over a list of arguments.  The great thing about checksum2creating a hash value for a table is that it gives you a unique value to represent each row. 

When would you use the CHECKSUM Function?

An example of using a CHECKSUM is to store the unique value for the entire row in a column that can be used later for comparison.  This is useful in a condition where all of the rows in a table need to be compared to perform an UPDATE.

When would you use the CHECKSUM Transformation?

Now that you have a little background on what CHECKSUM is it’s time to learn how to use it in SSIS.  The CHECKSUM Transformation is not included by default in SSIS but it was developed by Konesans and can be downloaded from www.sqlis.com

checksum

Normally in a package that is required to check for changes in data every single column from the destination must be compared to the incoming source.  With the CHECKSUM Transform creating a unique value for each row you only have to compare one column and that’s your CHECKSUM Value.

Checksum != DestCheckSum

To configure the CHECKSUM Transform open the editor and check the columns that you want to have the hash value created for and give the output column a name.

This is a great tool and a big time saver when developing!

Comments
 

Steve Jones said:

Good to see you blogging and nice summary here. I might include a slightly longer example of where this makes sense, perhaps a scenario that you've used it.

January 5, 2009 11:27 AM
 

knight_devin@hotmail.com said:

Thanks for the advice.  Maybe this topic would be a good article if I elaborated a lot more.

January 5, 2009 12:00 PM
 

Steve Jones said:

It definitely would be. It's a good blog length and a specific or two would help, but I wouldn't walk through step by step. If you do that, make it an article and submit it me or another publisher.

January 6, 2009 1:50 PM
 

markve said:

It does not work for the CRC32 algorithm. Create an SQL table ChecksumTest with 1 or more columns. Fill it with the same values for about 5 rows. You will see that the algorithm "Original" and "FrameworkChecksum" works fine. The results in all rows are the same. For the CRC32 algorithm the value is different for each row.

Anyone got an idea what's going wrong here?

January 8, 2009 6:19 AM
 

knight_devin@hotmail.com said:

That's interesting I haven't tried using that algorithm.  I did find this link explaining the CRC32 algorithm and it says not to use it for data integrity.  I don't really see  a good reason why though.

www.accuhash.com/what-is-crc32.html

January 8, 2009 6:35 AM
 

Hubajube said:

You can't use CRC32 if you must guarantee that you catch every change.  That is, there exist many inputs that have the same checksum.

The CRC32 algorithm bug over at sqlis.com is a different beast.  It seems like they are including the row number in the input somehow, meaning that identical values will produce different checksums if they exist in different rows.  That's unfortunate because it makes this algorithm unusable.

January 13, 2009 12:50 PM
 

rmg66 said:

I discovered the same "bug" myself. Since the other two options (Original and FrameworkChecksum) have known issues I want to avoid using them, but this issue with the CRC32 option makes it impossible to use. I have breifly reviewed the code (I'm not a dot.net programmer by any means) but it seems the stream that you are pushing to the hash function is somehow grabbing the internal row number on each row that it processes.

I can't imagine this is by design, unless I'm missing something here.

January 13, 2009 1:01 PM
Leave a Comment
Only members of SQLServerCentral may leave comments. Register now for your free account or Sign-In if you are already a member.