Blog Post

SQL Server Checksum

,

Use SQL Server Checksum in Data Warehousing

When building data warehouses you often need a fast way to know if the data in the warehouse is different from the incoming source data.  One method to do this is to create a SQL Server CHECKSUM column in the data warehouse table and another on the incoming table data.  By creating a checksum you can compare the checksum value in the data warehouse tables to the checksum value in your staging table.  If the values are different only then do you need to modify the data warehouse table.

For example, this script calculates a checksum column on the DimGeography table using all columns except [GeographyKey] in the AdventureWorkDW sample database.

SELECT TOP 10 [GeographyKey]
      ,[City]
      ,[StateProvinceCode]
      ,[StateProvinceName]
      ,[CountryRegionCode]
      ,[EnglishCountryRegionName]
      ,[SpanishCountryRegionName]
      ,[FrenchCountryRegionName]
      ,[PostalCode]
      ,[SalesTerritoryKey]
      ,checksum( city, StateProvinceCode, StateProvinceName, countryregioncode, EnglishCountryRegionName,
                  SpanishCountryRegionName, FrenchCountryRegionName, PostalCode, SalesTerritoryKey) as DW_Checksum
  FROM [AdventureWorksDW2008R2].[dbo] .[DimGeography]
Results:SQL Server Checksum
When loading new records into this table, comparing a Checksum on the source table that matches the same columns in the DimGeography (used in the script above) .  If the DW_Checksum values match then you do not need to modify the record.  If they are different then you know what records need to be updated.  In larger tables creating the checksum as a physical column and adding an index will increase the speed of your loading process.
 
For more information on CHECKSUM see MSDN: http://msdn.microsoft.com/en-us/library/ms189788.aspx

The post SQL Server Checksum appeared first on Derek E Wilson - Blog.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating