Technical Article

Poor Man's Table Difference

,

The code contains a stored procedure and code to show how it works aftwerwards.

PURPOSE: I need to sync data between two databases. One database resides on a computer without an internet connection.

The plan is to create a text file containing information from the isolated computer's SQL-Express database that can be put on a flash drive. This procedure will provide that data. Once the user is at a internet-ready machine, another application can use the text document to pull down only changes from a website that has the master database. In my case, data goes one direction: to the isolated computer's database. If anything needs to ship the other direction, the user specifically selects it.

THEORY: The procedure builds a custom query which will group on the primary key, and create a checksum of all fields we care about.

The Checksum is created

  • for each row: combining all fields into a text string and doing a HASHBYTES on that
  • sum parts of the checksum into one bigint

You're welcome offer comments on it's design and thoughts about how unique the hash may end up being.

USAGE:

Should be easy to see how to use the procedure. The procedure expects

  • @TableName can be a temp table or table in another database
  • @PrimaryKey data type is flexible. You could even pass it two fields if needed, as in 'HalfAUniqueKey, OtherHalfOfUniqueKey'
  • @FieldList list of fields (normally separated by a semicolon) that we care to check for changes. Data types can be mixed. I've tested it on numeric and some text fields.
  • @FieldListDelimiter optional..you see the default is a semicolon
  • @NullText optional..varchar of something the database will never see in reality
  • @DEBUG optional..if set to 1, will print the SQL code rather than running it

The procedure returns a table, which can then be linked to other tables in the database for further processing. In my case, I link it to the table with the primary keys, and pull out the human-readable description rather than the ID.

 

Again, feel free to offer comments.

 

 

Kurt

create procedure spRowChecksum (
    @TableName varchar(200), 
    @ForeignKey varchar(20),
    @FieldList varchar(2000), 
    @FieldListDelimiter varchar(1) = ';', 
    @NullText varchar(20) = '~~',
    @DEBUG bit = 0) as

declare @SQL nvarchar(4000)

-- Expand @FieldList to SQL code
set @SQL = '), ''' + @NullText +''')'
set @SQL = 
    replace(
        replace(
            replace(
                ';1' + replace(@FieldList, ';', ';3;2;1') + ';3', 
                ';1', 
                'isnull(convert(varchar, '
            ), 
            ';2', 
            ' + '';'' + '
        ), 
        ';3', 
        @SQL
    )

-- fill in the rest
set @SQL = 
'select 
    ' + @ForeignKey + ',
    sum(convert(bigint, convert(binary(4), substring(cksum, 1,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum, 5,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum, 9,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,13,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,17,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,21,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,25,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,29,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,33,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,37,4)))) cksum
from 
    (select
        ' + @ForeignKey + ', HASHBYTES(''sha'', ' + @SQL + ') cksum
    from ' + @TableName + ') data
group by ' + @ForeignKey

if @DEBUG = 1
    print @SQL
else
    exec (@SQL)

go

-------------------
-- Test code

create table #Table1 (pk int identity, fk int, field1 int, field2 varchar(10), lastfield char(2))
create table #Table2 (pk int identity, fk int, field1 int, field2 varchar(10), lastfield char(2))

insert into #Table1 (fk, field1, field2, lastfield) values (1, 4, 'testing1', 'x')
insert into #Table1 (fk, field1, field2, lastfield) values (2, 5, 'testing2', null)
insert into #Table1 (fk, field1, field2, lastfield) values (3, 60, 'testing3', 'z')
insert into #Table1 (fk, field1, field2, lastfield) values (4, 160, 'testing3', 'z')

insert into #Table2 (fk, field1, field2, lastfield) values (1, 4, 'testing1', 'x')
insert into #Table2 (fk, field1, field2, lastfield) values (2, 5, 'testing2', null)
insert into #Table2 (fk, field1, field2, lastfield) values (3, 62, 'testing3', 'z')


create table #ckTable1 (fk int, cksum bigint)
create table #ckTable2 (fk int, cksum bigint)

insert into #ckTable1
exec spRowChecksum '#Table1', 'fk', 'field1;field2;lastfield'

insert into #ckTable2
exec spRowChecksum '#Table2', 'fk', 'field1;field2;lastfield'

-- show mismatches
select *
from #ckTable1 t1
full outer join #ckTable2 t2
on t1.fk = t2.fk
where 
    t1.cksum != t2.cksum or
    t1.fk is null or
    t2.fk is null

Rate

4 (2)

You rated this post out of 5. Change rating

Share

Share

Rate

4 (2)

You rated this post out of 5. Change rating