RE: Speeding Up Incremental ETL Processes in SSIS by Using MD5 Hashes

SSChampion

Points: 13482

June 8, 2010 at 10:36 am

Great article, thanks! I can see where it might speed things up, but only if you are constrained on your writes. In my current ETL process I am constrained on reading the data, since it comes over a WAN link, so all this would end up doing is complicating things and adding CPU overhead. Now, if I could store the hash in the source DB and pull only the changed records over the WAN, that could make a huge difference.
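
To make that last idea concrete, here is a rough Python sketch, not anything from the article. It assumes the source DB can expose a key plus a stored MD5 per row, so only that small key/hash list crosses the WAN first and the full rows are fetched just for the keys that differ. The names row_hash and keys_to_pull are made up for illustration.

```python
import hashlib

def row_hash(row, columns):
    """MD5 over the tracked columns of one row (hypothetical helper).

    A unit-separator character is placed between values so that
    ('ab', 'c') and ('a', 'bc') do not hash the same.
    Note: None and '' are treated alike in this sketch.
    """
    joined = "\x1f".join("" if row[c] is None else str(row[c]) for c in columns)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

def keys_to_pull(source_hashes, dest_hashes):
    """Return keys that are new or whose hash differs in the destination.

    Both arguments are dicts of {key: md5_hex}. Only these keys need
    their full rows pulled across the slow link.
    """
    return sorted(k for k, h in source_hashes.items() if dest_hashes.get(k) != h)
```

With that, the expensive read shrinks to the rows in keys_to_pull(...), at the cost of keeping the hash column in the source up to date.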

You actually end up doing more reads (the stored hashes have to be looked up in the destination before anything is written), so if the percentage of changed rows is very high you could actually perform worse. (Or, if you are CPU bound, the hash calculation could end up slowing you down.)

The other thing I would like to see it deal with is deletes: if records are deleted from the source, they would stay in your destination...
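
One way deletes could be caught, as a minimal sketch under my own assumptions (the article does not cover this): pull the full list of business keys from the source and compare it with the keys in the destination. Anything present only in the destination was deleted upstream. The function name deleted_keys is hypothetical.

```python
def deleted_keys(source_keys, dest_keys):
    """Keys that exist in the destination but no longer in the source."""
    return sorted(set(dest_keys) - set(source_keys))
```

This does mean reading every key from the source on each run, so over a slow link it is only cheap if the keys are narrow.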