Fuzzy Lookup???

  • Hi all,

    I have two tables: a Persons table (a lot of records, where the surnames could have been entered incorrectly due to human error) and a Surname table (10 records with the correct surnames that I am looking for in the Persons table).

    How do I use Fuzzy Lookup to find surnames that look alike?

    E.g. Petersen could appear as peteiren, peterson, or pterson.

    NB: The same surname could appear more than once in the Persons table.

  • For something like names, I would start with T-SQL SOUNDEX.

    http://www.devx.com/enterprise/Article/43757
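    To get a feel for what a Soundex comparison would and would not catch on the surnames from the question, here is a minimal sketch of the classic American Soundex rules, written in Python only so it is self-contained. It is an illustration, not T-SQL's SOUNDEX itself, which may differ in edge cases; the function name `soundex` and the sample list are assumptions.

```python
# Classic American Soundex, sketched for illustration only.
# T-SQL's SOUNDEX may treat some edge cases differently.
CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)

def soundex(name):
    """Return a 4-character code: first letter plus up to three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate repeated codes
        code = CODES.get(ch, "")  # vowels map to "" and reset prev
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]

for surname in ("Petersen", "peteiren", "peterson", "pterson"):
    print(surname, soundex(surname))
```

    With these rules, peterson and pterson get the same code as Petersen, but peteiren does not, because the misplaced letters change the consonant sequence. That kind of miss is where a similarity-based fuzzy match becomes useful.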

    Fuzzy Matching Process:

    If that doesn't help, try Fuzzy Matching. Start by getting familiar with the data.

    Developing a successful fuzzy matching process is very custom work. It's not an exact science, but an art.

    Instead of fuzzy matching blindly on surnames, start by looking for fields that should be used for an exact match, such as age/DOB or zip code.

    Keep those fields as exact matches and fuzzy match on surname. Adding more fields for fuzzy matching will improve your chances of arriving at a better match. Set the maximum number of matches returned per lookup to 100, which seems to be the upper limit in the advanced properties. Pick the record with the highest similarity score as your matching reference data.
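    The idea of exact-matching some fields and then keeping the highest-scoring fuzzy match on surname can be sketched as follows. This is only an illustration in Python: `difflib.SequenceMatcher` stands in for the Fuzzy Lookup similarity score (which is computed differently), and the `zip` field, the helper names, and the sample rows are all hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive similarity in [0, 1]; a stand-in for the Fuzzy Lookup score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_match(person, reference_rows, exact_fields=("zip",)):
    """Return (reference_row, score) with the highest surname similarity,
    considering only reference rows that agree exactly on exact_fields."""
    candidates = [
        ref for ref in reference_rows
        if all(ref[f] == person[f] for f in exact_fields)
    ]
    if not candidates:
        return None, 0.0
    scored = [(ref, similarity(person["surname"], ref["surname"]))
              for ref in candidates]
    return max(scored, key=lambda pair: pair[1])

# Hypothetical sample data, not taken from the original tables.
reference = [
    {"surname": "Petersen", "zip": "8000"},
    {"surname": "Pedersen", "zip": "9000"},
    {"surname": "Johnson", "zip": "8000"},
]
person = {"surname": "peteiren", "zip": "8000"}

match, score = best_match(person, reference)
print(match["surname"], round(score, 3))
```

    Restricting the candidates first keeps a near-identical surname from a different zip code (Pedersen here) from competing with the intended match.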

    If you can identify patterns in the data, such as repeated mistakes, you can start cleaning up your source data before matching to arrive at a better match.
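    As a rough sketch of that kind of clean-up, the snippet below normalizes case, strips non-letter characters, and applies a small correction table. The table `KNOWN_FIXES` and the function `normalize_surname` are hypothetical; in practice the corrections would come from mistakes you actually observe in your own data.

```python
# Hypothetical correction table, built from mistakes seen repeatedly in the data.
KNOWN_FIXES = {"pterson": "peterson"}

def normalize_surname(raw):
    """Lower-case, drop anything that is not a letter, then apply known fixes."""
    cleaned = "".join(ch for ch in raw.strip().lower() if ch.isalpha())
    return KNOWN_FIXES.get(cleaned, cleaned)

print(normalize_surname("  Pterson. "))
print(normalize_surname("O'Brien"))
```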

    Improving the fuzzy matching process cannot be done in one cycle. It will take numerous iterations until you feel comfortable with the result. Once you reach saturation, meet with the users and tell them what error rate to expect in the data, for example 2% (or whatever the analysis of a sample set tells you). If they can't accept that and you have reached the saturation point of the fuzzy logic, it's time to develop an exception UI and design more complex fuzzy logic that pushes records with an unacceptable similarity score to human review.
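    A minimal sketch of that routing step, assuming each record already carries a similarity score: records at or above a threshold are accepted, and the rest go to a queue for manual review. The threshold of 0.8, the function name `route`, and the sample scores are illustrative assumptions, not recommended values.

```python
def route(scored_rows, threshold=0.8):
    """Split rows into auto-accepted matches and exceptions for human review."""
    accepted = [row for row in scored_rows if row["score"] >= threshold]
    exceptions = [row for row in scored_rows if row["score"] < threshold]
    return accepted, exceptions

# Hypothetical scores, for illustration only.
scored = [
    {"surname": "peterson", "score": 0.88},
    {"surname": "pterson", "score": 0.93},
    {"surname": "ptrsn", "score": 0.62},
]

accepted, exceptions = route(scored)
exception_rate = len(exceptions) / len(scored)
print(len(accepted), "accepted,", len(exceptions), "sent to review")
```

    Tracking the share of records that land in the exception queue across iterations gives you a concrete number to discuss with the users.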

    If
