Nice article Ira, kinda familiar too. :-)
JOEL-145858, don't take the content of the article out of context. This is one of MANY methods that you can use in matching, but it doesn't represent a complete matching solution. You would obviously find as many exact matches as possible first, as any kind of fuzzy matching is cost-prohibitive comparatively. Then, using the model Ira outlined, you can perform fuzzy matching against what remains unmatched. And using geographic elements for blocking and fuzzy matching only serves as one example; The same model would fit other elements. Example:
Demographic - Block on First 2 letters of last name, year of birth / fuzzy match on FirstName, LastName, DOB
Demo / Geo - Block on FirstName, DOB, State / Fuzzy Match on FirstName, LastName, Address, City, Zip
The whole point of the parallel blocks is to minimize your comparison set, and thereby the number of potential combinations. In matching solutions I have done using this exact method, multiple iterations of this model with different criteria served our matching needs very well.
P.S - the method is also highly scalable if you have the processing power and memory on your SSIS Box.
Joshua T. Lewis