It's an interesting approach, and one that I wish I could use, but it doesn't let me establish conditional matching. I'm therefore attempting to do the record grouping within T-SQL. I can identify the matches for a single record easily enough, but I'm having trouble doing so in bulk: I cannot seem to create a unique group identifier for each set of matches.
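To make the group-identifier problem concrete, here is a minimal sketch in Python rather than T-SQL, purely to illustrate the idea. It assumes the matching step has already produced a list of matched record-ID pairs; the function name `assign_group_ids` and the sample IDs are made up for illustration. It uses a union-find (disjoint-set) structure so that every record in a chain of matches ends up with the same identifier, here the smallest record ID in its set.

```python
def assign_group_ids(record_ids, match_pairs):
    """Map each record ID to a group ID (the smallest ID in its matched set)."""
    # Every record starts out as its own group.
    parent = {rid: rid for rid in record_ids}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in match_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Always keep the smaller ID as the root so the group ID is stable.
            if root_a < root_b:
                parent[root_b] = root_a
            else:
                parent[root_a] = root_b

    return {rid: find(rid) for rid in record_ids}


# Hypothetical sample: 1-2 and 2-3 match, so 1, 2 and 3 share a group.
groups = assign_group_ids([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)])
print(groups)
```

Note that this groups transitively: if record 1 matches 2 and 2 matches 3, all three land in one group even when 1 and 3 would not match directly. Whether that is acceptable depends on how strict the lower-confidence tests are.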
If I take each record one at a time and compare it against all the other records, I get a reasonably accurate match rate, but it takes almost 24 hours to group 90,000 records (and that's just my sample!). I currently have 23 tests, in order of decreasing confidence, to determine whether records should be grouped together. I considered using RANK() with an OVER (PARTITION BY) clause, but the ORDER BY portion is not precise enough to account for allowable variances (such as the month and day being reversed in a DOB).
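As an illustration of the kind of variance I mean, and of one way to avoid comparing every record against every other, here is a small Python sketch (again, not T-SQL). It assumes each record is just an ID plus a DOB; `dob_block_key` and `candidate_pairs` are hypothetical names. The key sorts month and day so that a DOB with the two reversed falls into the same bucket, and only records sharing a bucket are paired up for further testing.

```python
from collections import defaultdict
from datetime import date
from itertools import combinations


def dob_block_key(dob):
    """Key that is identical for a DOB and its month/day-swapped variant."""
    low, high = sorted((dob.month, dob.day))
    return (dob.year, low, high)


def candidate_pairs(records):
    """Return ID pairs whose DOBs agree, allowing month and day to be swapped."""
    blocks = defaultdict(list)
    for rid, dob in records:
        blocks[dob_block_key(dob)].append(rid)

    pairs = []
    for ids in blocks.values():
        # Only compare records inside the same bucket, not all against all.
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)


# Hypothetical sample: records 1 and 2 differ only by month/day order.
records = [
    (1, date(1980, 3, 7)),
    (2, date(1980, 7, 3)),
    (3, date(1980, 3, 8)),
    (4, date(1975, 12, 1)),
]
pairs = candidate_pairs(records)
print(pairs)
```

This covers only the DOB variance; each of the other tests would presumably need its own key, and the surviving pairs would still have to go through the actual confidence tests.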
Does anyone have any suggestions?