• KermitTheRock (3/3/2011)


    Todd,

    This is why FTS is hard. splitting some text by whitespace, building a list of "words" and pointers back to the text isn't difficult.

    Knowing when and how to group a some characters as a word is hard, detecting mispellings is hard. MSSQL handles the first using its parsing methods, but does not handle the second at all. Perhaps all queries need to be spell checked before going to the index.

    Kermit,

    For the applications I was doing this for, misspellings and plural/singular weren't an issue.

    Take wine snobs ordering on-line. If they type in blanc (French) they don't want to see blanco (Spanish). If they had a typo, they wouldn't get anything back in some cases and realize that they had misspelled the word and re-enter the search criteria. Since the search is so bloody fast, they haven't complained yet about not finding misspelled words.

    Stores ordering auto parts. Some items are very different if plural. They type in brake drum (singular) and that's what they want to see - all items with brake drum in the description, which would be packaged as a single brake drum. This item is also sold as a set under a different item number. They type in brake drums and they see all items sold as a set.

    In the auto parts application especially the users were very pleased since the query using the LIKE operator used to take over 30 seconds to complete and they would see both single items and kits if they typed brake drum. Using the technique in the article the average time was 1.5 seconds to return the item list for them to pick from.

    Todd Fifield