SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Checksum function???


Checksum function???

Author
Message
JustANumber
JustANumber
SSC-Enthusiastic
SSC-Enthusiastic (189 reputation)SSC-Enthusiastic (189 reputation)SSC-Enthusiastic (189 reputation)SSC-Enthusiastic (189 reputation)SSC-Enthusiastic (189 reputation)SSC-Enthusiastic (189 reputation)SSC-Enthusiastic (189 reputation)SSC-Enthusiastic (189 reputation)

Group: General Forum Members
Points: 189 Visits: 113
I did this recently to determine if a row needed to be updated. I put a checksum column in the table using the values that could possibly be updated. The stored procedure had a checksum variable based on the input variables. The input variables for the proc are also the DB columns that could possibly be updated.

By comparing the DB checksum calculation with the stored procedure checksum variable I knew whether the row needed updating or not.
GSquared
GSquared
SSC Guru
SSC Guru (58K reputation)SSC Guru (58K reputation)SSC Guru (58K reputation)SSC Guru (58K reputation)SSC Guru (58K reputation)SSC Guru (58K reputation)SSC Guru (58K reputation)SSC Guru (58K reputation)

Group: General Forum Members
Points: 58385 Visits: 9730
Mark Pratt (9/1/2009)
I did this recently to determine if a row needed to be updated. I put a checksum column in the table using the values that could possibly be updated. The stored procedure had a checksum variable based on the input variables. The input variables for the proc are also the DB columns that could possibly be updated.

By comparing the DB checksum calculation with the stored procedure checksum variable I knew whether the row needed updating or not.


Makes sense.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
ducon
ducon
SSC Eights!
SSC Eights! (913 reputation)SSC Eights! (913 reputation)SSC Eights! (913 reputation)SSC Eights! (913 reputation)SSC Eights! (913 reputation)SSC Eights! (913 reputation)SSC Eights! (913 reputation)SSC Eights! (913 reputation)

Group: General Forum Members
Points: 913 Visits: 286
Mark Pratt (9/1/2009)
I did this recently to determine if a row needed to be updated. I put a checksum column in the table using the values that could possibly be updated. The stored procedure had a checksum variable based on the input variables. The input variables for the proc are also the DB columns that could possibly be updated.

By comparing the DB checksum calculation with the stored procedure checksum variable I knew whether the row needed updating or not.


Hi,

Yep, this is exactly the point I was trying to make. Note that is very easily doable using a script component in a data flow and from what we've tried to far, goes pretty fast.

Regards,

Greg
sqlrunner
sqlrunner
SSC-Enthusiastic
SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)SSC-Enthusiastic (113 reputation)

Group: General Forum Members
Points: 113 Visits: 159
If you are using sql 2005, and the columns you are interested in are varchar,nvarchar, or varbinary, you can try making a computed column out of the /b HashBytes /b function. There are a few limits, one being the total input size cannot be greater than 8k ( I think this was fixed in either sp3 or 2008). I also believe it is deterministic so you can put an index on it. Use the SHA1 algorithm as the chance of collision is 1 in 2^80.

Josef Richberg
2009 Exceptional DBA
http://www.josef-richberg.squarespace.com
http://twitter.com/sqlrunner
John Rowan
John Rowan
SSChampion
SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)

Group: General Forum Members
Points: 13764 Visits: 4588
Is there any sort of date column in the source table that could show you which rows have been updated? If you could do something to avoid comparing the million rows, that may help.

Also, have you tried the SCD transformation? Between that and proper indexing on your dimension, you should still get decent performance out of the compare.

How much free reign do you have on the source system? Would it be possible to make changes there to aid in making this easier?

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
vishal.gamji
vishal.gamji
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1670 Visits: 532
I have found the checksum to be useful in some cases and unreliable in some. If the data set if distinct then the checksum function (of all rows stands good), but in another case i had to get 27 Million rows from DB2 with no primary key, no uniqueness on the table, so i thought of using checksum to compare the rows in a staging environment. Not only did i have the overhead of creating the checksum, i also had to compare it + the checksum gives me a integer value based on the row uniqueness, where i had duplicates and everything was getting messed up..... so in short i would think the usage totally depends on the case.
John Rowan
John Rowan
SSChampion
SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)

Group: General Forum Members
Points: 13764 Visits: 4588
The OP has not responded for quite some time now.

SQL Learner? Are you still working on this? We'd appreciate some feedback as to how you've resolved the problem. Having that in the thread will help others that come accross this thread down the road.

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search