fine Tuning?

  • Actually, there is nothing wrong with your original SQL but think about what the requirement to compare every name and address to every other combination of name and address what this means in term of effort.

    If there are 50,000 rows after joining name to address then each of these rows need to be compared to 49,999 other rows for a match. This works out to 2,499,950,000 (yes, 2 and half billion ) comparisons that need to be made, so you should expect CPU to go to 100%.

    This will probably mean that you will have 50,000 table scans and since most DBAs think that more than one table scan resulting from a single statement is bad, you can understand why your DBA is upset.

    You could possibly gain some performance by remove non-alphabetic characters from the name and address only once.

    "Replace( .... , Name )" is a short hand for the real replace as you have coded.

    SELECT firm.ROW_ID, firm.NAME, addresses.Addr

    , Replace( .... , Name ) as NameFixed

    , Replace( ... , Address) as AddressFixed

    INTO #Work

    FROM firm

    JOIN addresses

    ON Firm.ROW_ID = addresses.PR_ADDR_ID

    Then self-join the #Work table to itself:

    select

    from #Work as A

    , #Work as B

    where DIFFERENCE(A.NameFixed, B.NameFixed) = 4

    OR DIFFERENCE(A.AddressFixed, B.AddressFixed) = 4

    SQL = Scarcely Qualifies as a Language

Viewing post 16 (of 15 total)

You must be logged in to reply to this topic. Login to reply