• Update: Based on several excellent forum contributions, I asked Michael Capes (a friend with a big brain) to review them and provide feedback. The result is a revision to the original script and article. His review follows. The Jaro-Winkler code in the article has been updated, and the new code is in the zip file TestJaroWinkler.ZIP, which will be available soon. If you have any questions, please let me know.

    I believe the "expected" results that the forum poster gave are incorrect. I was looking through the PDF document from Winkler from which I believe the poster took his test set. The paper is titled "AN APPLICATION OF THE FELLEGI-SUNTER MODEL OF RECORD LINKAGE TO THE 1990 U.S. DECENNIAL CENSUS". There's a table in that paper which matches the poster's comments. I don't believe this table was produced with Jaro-Winkler. Instead, it looks like it was produced by another algorithm.

    Furthermore, I got the C# SimMetrics utility to run this morning. Its results also matched the earlier results from my Oracle and T-SQL tests.

    I believe the revised T-SQL correctly implements the Jaro-Winkler algorithm. I've verified it against two other implementations, and all three results match.
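
For anyone who wants to repeat this kind of cross-check, here is a minimal Python sketch of the commonly published Jaro-Winkler definition. It is not the T-SQL from the article and not the SimMetrics code. The function names, the prefix scale of 0.1, and the 4-character prefix cap are assumptions taken from the usual textbook description, and this sketch applies the prefix boost unconditionally (some implementations only apply it above a Jaro threshold).

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1] (textbook definition; illustrative sketch)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = True
                matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1, max_prefix=4):
    """Jaro similarity boosted for a shared prefix (assumed defaults 0.1 and 4)."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


if __name__ == "__main__":
    for a, b in [("MARTHA", "MARHTA"), ("DWAYNE", "DUANE"), ("DIXON", "DICKSONX")]:
        print(a, b, round(jaro_winkler(a, b), 4))
```

With these assumptions the three pairs above come out at 0.9611, 0.84, and 0.8133, which are the values usually quoted for Jaro-Winkler. Running the same pairs through the T-SQL function gives a quick way to compare implementations.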

    Ira

    Ira Warren Whiteside