• wow, thats going to be tough;

    the only thing i could think of was a combination of opc.three's example, and joining it against a list of common suffixes to find potential duplicates, but that of course is going an ongoing thing as you dig deeper into the data.

    something like this is what i thought might be a starting point:

    With MySampleData(CompanyName)

    AS

    (

    SELECT 'Costco' UNION ALL

    SELECT 'Costco LLC' UNION ALL

    SELECT 'Costco Whls' UNION ALL

    SELECT 'Home Interiors Malaga' UNION ALL

    SELECT 'Home Plumbing' UNION ALL

    SELECT 'Home Property Management' UNION ALL

    SELECT 'Home Realty' UNION ALL

    SELECT 'Home Svc'

    ),

    CommonSuffixes (val)

    AS

    (

    SELECT ' Inc' UNION ALL

    SELECT ' LLC' UNION ALL

    SELECT ' Company' UNION ALL

    SELECT ' Co'

    )

    SELECT

    ROW_NUMBER() OVER (PARTITION BY SOUNDEX(CompanyName) ORDER BY CompanyName) AS RW,

    SOUNDEX(CompanyName) AS SoundX,

    *

    FROM MySampleData

    LEFT OUTER JOIN CommonSuffixes

    ON CHARINDEX(CommonSuffixes.val,MySampleData.CompanyName) > 0

    --WHERE CommonSuffixes.val IS NOT NULL --(turns the LEFT OUTER into an inner join, i know)

    ORDER BY CompanyName,RW

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!