Technical Article

Poor Man's Table Difference

,

The code contains a stored procedure and code to show how it works aftwerwards.

PURPOSE: I need to sync data between two databases. One database resides on a computer without an internet connection.

The plan is to create a text file containing information from the isolated computer's SQL-Express database that can be put on a flash drive. This procedure will provide that data. Once the user is at a internet-ready machine, another application can use the text document to pull down only changes from a website that has the master database. In my case, data goes one direction: to the isolated computer's database. If anything needs to ship the other direction, the user specifically selects it.

THEORY: The procedure builds a custom query which will group on the primary key, and create a checksum of all fields we care about.

The Checksum is created

  • for each row: combining all fields into a text string and doing a HASHBYTES on that
  • sum parts of the checksum into one bigint

You're welcome offer comments on it's design and thoughts about how unique the hash may end up being.

USAGE:

Should be easy to see how to use the procedure. The procedure expects

  • @TableName can be a temp table or table in another database
  • @PrimaryKey data type is flexible. You could even pass it two fields if needed, as in 'HalfAUniqueKey, OtherHalfOfUniqueKey'
  • @FieldList list of fields (normally separated by a semicolon) that we care to check for changes. Data types can be mixed. I've tested it on numeric and some text fields.
  • @FieldListDelimiter optional..you see the default is a semicolon
  • @NullText optional..varchar of something the database will never see in reality
  • @DEBUG optional..if set to 1, will print the SQL code rather than running it

The procedure returns a table, which can then be linked to other tables in the database for further processing. In my case, I link it to the table with the primary keys, and pull out the human-readable description rather than the ID.

 

Again, feel free to offer comments.

 

 

Kurt

create procedure spRowChecksum (
    @TableName varchar(200), 
    @ForeignKey varchar(20),
    @FieldList varchar(2000), 
    @FieldListDelimiter varchar(1) = ';', 
    @NullText varchar(20) = '~~',
    @DEBUG bit = 0) as

declare @SQL nvarchar(4000)

-- Expand @FieldList to SQL code
set @SQL = '), ''' + @NullText +''')'
set @SQL = 
    replace(
        replace(
            replace(
                ';1' + replace(@FieldList, ';', ';3;2;1') + ';3', 
                ';1', 
                'isnull(convert(varchar, '
            ), 
            ';2', 
            ' + '';'' + '
        ), 
        ';3', 
        @SQL
    )

-- fill in the rest
set @SQL = 
'select 
    ' + @ForeignKey + ',
    sum(convert(bigint, convert(binary(4), substring(cksum, 1,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum, 5,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum, 9,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,13,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,17,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,21,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,25,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,29,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,33,4)))) +
    sum(convert(bigint, convert(binary(4), substring(cksum,37,4)))) cksum
from 
    (select
        ' + @ForeignKey + ', HASHBYTES(''sha'', ' + @SQL + ') cksum
    from ' + @TableName + ') data
group by ' + @ForeignKey

if @DEBUG = 1
    print @SQL
else
    exec (@SQL)

go

-------------------
-- Test code

create table #Table1 (pk int identity, fk int, field1 int, field2 varchar(10), lastfield char(2))
create table #Table2 (pk int identity, fk int, field1 int, field2 varchar(10), lastfield char(2))

insert into #Table1 (fk, field1, field2, lastfield) values (1, 4, 'testing1', 'x')
insert into #Table1 (fk, field1, field2, lastfield) values (2, 5, 'testing2', null)
insert into #Table1 (fk, field1, field2, lastfield) values (3, 60, 'testing3', 'z')
insert into #Table1 (fk, field1, field2, lastfield) values (4, 160, 'testing3', 'z')

insert into #Table2 (fk, field1, field2, lastfield) values (1, 4, 'testing1', 'x')
insert into #Table2 (fk, field1, field2, lastfield) values (2, 5, 'testing2', null)
insert into #Table2 (fk, field1, field2, lastfield) values (3, 62, 'testing3', 'z')


create table #ckTable1 (fk int, cksum bigint)
create table #ckTable2 (fk int, cksum bigint)

insert into #ckTable1
exec spRowChecksum '#Table1', 'fk', 'field1;field2;lastfield'

insert into #ckTable2
exec spRowChecksum '#Table2', 'fk', 'field1;field2;lastfield'

-- show mismatches
select *
from #ckTable1 t1
full outer join #ckTable2 t2
on t1.fk = t2.fk
where 
    t1.cksum != t2.cksum or
    t1.fk is null or
    t2.fk is null

Rate

(2)

You rated this post out of 5. Change rating

Share

Share

Rate

(2)

You rated this post out of 5. Change rating