help to fix hash procedure

  • hi, i create a procedure to get hash code from entire table's contents. it works fine in 99% , but in some cases it returns the same number for different data.

    in this code the ord value is swapped for two records, bat the hash is the same.

    is there a way to fix the procedure ?


    SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE TABLE [dbo].[TestTable](
    [ID] [int] NOT NULL,
    [DocID] [varchar](10) NOT NULL,
    [Ord] [int] NULL,
    CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED
    (
    [ID] ASC,
    [DocID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE PROCEDURE [dbo].[GetCRC]
    (
    @Table nvarchar(100)
    )
    AS
    BEGIN

    Declare @CRC nvarchar(100)
    Declare @CRCResult nvarchar(100)
    Declare @SQL nvarchar(500)
    Declare @ParamDefinition nvarchar(100)

    Set @ParamDefinition = N'@CRC nvarchar(100) OUTPUT';
    Set @SQL = 'Select @CRC = CHECKSUM_AGG(BINARY_CHECKSUM(*)) From ' + @Table

    EXECUTE sp_executesql @SQL , @ParamDefinition, @CRC = @CRC output;
    Set @CRCResult = IsNull(@CRC,'0')

    Select IsNull(@CRCResult,'0') as CRC, @Table as TableName

    END

    GO

    Insert into TestTable (ID,DocID,Ord)
    values(1, 'aaa', 1),
    (1, 'bbb', 2),
    (1, 'ccc', 3),
    (1, 'ddd', 4),
    (1, 'eee', 5),
    (1, 'fff', 6),
    (1, 'ggg', 7)

    exec [dbo].[GetCRC] 'TestTable'

    update TestTable set ord=2 where id=1 and DocID='aaa'
    update TestTable set ord=1 where id=1 and DocID='bbb'

    exec [dbo].[GetCRC] 'TestTable'?

     

  • There is no guarantee that CHECKSUM_AGG() will return a unique value for every different recordset, so 'fixing' the proc is not really an option unless you are prepared to use additional hashing methods to reduce the likelihood of collisions.

    But the likelihood will never be zero.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • yes, using another approach is a valid option, i need, at least, to reduce collisions.

  • Maybe use a checksum + some other value in the row? A PK perhaps?

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply