How to compare text columns

  • Hi All,

    I have a table that stores ASCII text in a column of type text in a SQL 2000 database.  I need to compare new records with existing in order to avoid duplicate entries.  The table has slightly over 20,000 records and grows at a rate of about 15-20 records per day.

    I understand that SQL Server 2000 doesn't allow something like

    if (NOT Exists(Select ID from the_table where the_text  = the_text))

    How can I compare text columns to avoid duplicates? 

    Thanks,

    Steve

  • Have you tried this script in the library:

    http://www.sqlservercentral.com/scripts/viewscript.asp?scriptid=1527


    Cheers,

    Diane

  • Thanks for your response Diane but I don't think this is giving me what I need.  Of the 20,000+ records in this table, only about 14,000 are unique.  However, using the function generated by the script you suggested creates an interger value which is unique for each record.  In other words, it doesn't identify any duplicates in the table.

    select distinct dbo.fn_CheckSum_Text(my_text) from my_table

    returns every record in the table.

    Am I doing something wrong or is there another way check text fields to ensure that duplicate records are prevented?

    Steve

  • Not especially efficient, but you can do

    WHERE CHARINDEX(, ) = 1

  • Hmmn, I put the greater-than / less-than brackets in last post, they were stripped out....

    WHERE CHARINDEX(The_Text, TextCol) = 1

  • I assume that the text you are comparing could be greater then 8000 char.  If not, a redefinition to varchar(8000) could solve your problem.

     

    or you could compare the first 8000 char using substring

  • Some of the records are considerably more than 8,000 characters .

  • Then you could always break the statement into multiple filters

    where substring(a.text, 1, 8000) = substring( b.text, 1, 8000)

      and substring(a.text, 8001, 8000) = substring( b.text, 8001, 8000)

      and substring(a.text, 16001, 8000) = substring( b.text, 16001, 8000)

    (dont forget the isnull around each or a check that datalength of each is right for substring size)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply