Alternate to CHECKSUM ??

  • Hi All, Does any one have (/know) a better algorithm which can generate unique checksum for given set of columns?

    I am using SQL 2005 and though bol mentioned that sql 2005 has improved algorithm for checksum ,than sql 2000 I found that it works well for small set of rows and produce same checksum  if rows exceed 1L+ or so..  

    Reason I am using checksum is,  I have to perform Joins on tables that involves more than 10-15  columns (with 1ML rows) so having index on those columns is not giving me good performance as compared to checksum.

    pls suggest.  

       

    Regards
    Shrikant Kulkarni

  • You could try a checksum with an explicit column list, and change the order of the columns. This can help to reduce duplicate values. what are the columns you are performing CHECKSUM() on? Can you post some sample data?

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • Hi stax, thanks for suggestion but in my case both the table and columns get generated dynamically during process. Here is query I want to optimize. creating non-clustered index is not much helpful (1.30+ min).. ..hence going for checksum (16 sec).

    SET ANSI_NULLS OFF --(application takes NULL=NULL as true) 

    SELECT COUNT(1) FROM Aad2fabbb_af57_47eb_be77_bfc688f5011f  IP

    INNER  JOIN Aad2fabbb_af57_47eb_be77_bfc688f5011f OP 

    ON (IP.F2796 =OP.F2796 OR (IP.F2796 = NULL AND OP.F2796 = NULL ))

    AND (IP.F2798 =OP.F2798 OR (IP.F2798 = NULL AND OP.F2798 =NULL ))

    AND (IP.F2800 =OP.F2800 OR (IP.F2800 = NULL AND OP.F2800 =NULL ))

    AND (IP.F2802 =OP.F2802 OR (IP.F2802 = NULL AND OP.F2802 =NULL ))

    AND (IP.F2804 =OP.F2804 OR (IP.F2804 = NULL AND OP.F2804 =NULL ))

    AND (IP.F2806 =OP.F2806 OR (IP.F2806 = NULL AND OP.F2806 =NULL ))

    AND (IP.F2808 =OP.F2808 OR (IP.F2808 = NULL AND OP.F2808 =NULL ))

    AND (IP.F2810 =OP.F2810 OR (IP.F2810 = NULL AND OP.F2810 =NULL ))

    AND (IP.F2812 =OP.F2812 OR (IP.F2812 = NULL AND OP.F2812 =NULL ))

    AND (IP.F2814 =OP.F2814 OR (IP.F2814 = NULL AND OP.F2814 =NULL ))

    AND  ( IP.F2824 BETWEEN OP.F2824 AND OP.F2825

            OR IP.F2825 BETWEEN OP.F2824 AND OP.F2825)

     

    I can have any number of joins, Instead Use checksum.. but order of columns coming in checksum is dynamic.  

    ALTER

    TABLE Aad2fabbb_af57_47eb_be77_bfc688f5011f ADD cs_keycols AS CHECKSUM (F2798,F2800,F2802,F2804,F2806,F2808,F2810,F2812,F2814); GO

    CREATE INDEX keycols_index ON Aad2fabbb_af57_47eb_be77_bfc688f5011f (cs_keycols);GO

    SELECT

    COUNT(1) FROM Aad2fabbb_af57_47eb_be77_bfc688f5011f_1 IP inner join Aad2fabbb_af57_47eb_be77_bfc688f5011f_1 OP on IP.cs_keycols = OP.cs_keycols AND ( IP.F2824 BETWEEN OP.F2824 AND OP.F2825 OR IP.F2825 BETWEEN OP.F2824 AND OP.F2825)

     

     

     

       

    Regards
    Shrikant Kulkarni

  • So are they all int columns? Is the domain the full range of int values?

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • Not really, the columns can be all Int/char or combinations of int and char.  

     

    Regards
    Shrikant Kulkarni

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply