I'm working on a project for a client that uses the Fuzzy Grouping Transformation (FGT) component to identify potential duplicate records in the system. It appears that you can define a query to use as the source of records to look through, but as far as I can tell, the FGT then uses that query to look through the rest of the database. I need to restrict it to just a subset of the data in the database. Sometimes that subset will be the whole database, but more often than not it will be a true subset. Does anyone have any experience using the FGT? Is there a way to limit what the logic looks at, or do I need to look at everything first and then remove from the results any data that isn't in the subset?
Carp... the Fuzzy Grouping transformation component uses the input pipeline to do its searching, doesn't it? Ouch... that's going to make things a little more difficult.
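
If the grouping really does compare only the rows that arrive on its input, then restricting the source to the subset before it reaches the component should be enough. Here is a rough Python sketch of that "filter first, then group" idea. It is not SSIS code and not how the FGT matches internally: the `fuzzy_group` helper, the `region` column, the sample rows, and the 0.8 threshold are all made up for illustration, and the similarity measure is Python's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def fuzzy_group(rows, threshold=0.8):
    """Group rows whose 'name' is similar to the first row of an existing group.

    Hypothetical helper for illustration only; it sees nothing but `rows`.
    """
    groups = []  # list of (canonical_name, member_rows)
    for row in rows:
        for canonical, members in groups:
            score = SequenceMatcher(
                None, row["name"].lower(), canonical.lower()
            ).ratio()
            if score >= threshold:
                members.append(row)
                break
        else:
            # No existing group was close enough, so this row starts a new one.
            groups.append((row["name"], [row]))
    return groups


# Made-up sample data standing in for the whole database.
records = [
    {"id": 1, "region": "east", "name": "Jon Smith"},
    {"id": 2, "region": "east", "name": "John Smith"},
    {"id": 3, "region": "west", "name": "Jon Smith"},
    {"id": 4, "region": "east", "name": "Mary Jones"},
]

# Restrict the input first (the equivalent of a WHERE clause on the source).
subset = [r for r in records if r["region"] == "east"]

# Only the subset is ever compared; row 3 is never looked at.
groups = fuzzy_group(subset)
grouped_ids = [[r["id"] for r in members] for _, members in groups]
print(grouped_ids)
```

When the subset happens to be the whole database, the same flow works with a filter that lets every row through, so there is no need to group everything and prune the results afterwards.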