Old Hand
      
Group: General Forum Members
Last Login: Wednesday, April 07, 2010 2:37 PM
Points: 315,
Visits: 164
|
|
Not a single reference to SOUNDEX? That doesn't seem right in a spelling suggestion article.
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Wednesday, March 28, 2012 8:30 AM
Points: 148,
Visits: 48
|
|
So add it! SOUNDEX doesn't really cover the typo angle that Bill seems to have concentrated on. Bill's approach is probably a lot better for search results, given SOUNDEX's sometimes 'strange' output.
Seriously though, if you combined this with some Ajax and used it for dynamic spelling correction, the way Word does, it would be very nifty.
I recently built an Ajax menu that works on the very simple concept of 'update as you type'. Given folks' talent for typos and fat-fingering, this might just get added to second-guess them.
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Thursday, July 22, 2010 8:59 AM
Points: 110,
Visits: 952
|
|
We went with the premise that people would rather find what they're looking for than be prompted with spelling corrections. So we wrote the search feature to join the product keywords to the misspellings, which are stored as synonyms of the correct term, and to match on whatever the customer entered. We monitor the misses between requested keywords and those already in the search table, so new misspellings are added from those seen in the wild. The programmatic permutations could easily be incorporated too.
BTW, an added benefit of the "synonyms" notion is that we can make the plural form of a keyword a synonym of the singular, so the content editors need only associate the singular keyword with the product. Keywords that are similar in concept can also be aliased together, so only one keyword from that concept group needs to be applied to the product for any of the group to match. (Different ranks/weightings can be used to place exact matches higher in the results than alias matches.)
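The synonym idea described above can be sketched roughly as follows. This is only an illustration, not the poster's actual schema: it uses Python's built-in sqlite3 module rather than SQL Server, and every table, column, and product name (keyword, synonym, product_keyword, miss_log, search) is a hypothetical stand-in.

```python
import sqlite3

# Hypothetical schema: every name here is an assumption made for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keyword (keyword_id INTEGER PRIMARY KEY, term TEXT UNIQUE);
CREATE TABLE synonym (alias TEXT, keyword_id INTEGER, weight REAL);
CREATE TABLE product_keyword (product TEXT, keyword_id INTEGER);
CREATE TABLE miss_log (entry TEXT);
""")
conn.executemany("INSERT INTO keyword VALUES (?, ?)",
                 [(1, "widget"), (2, "gadget"), (3, "gizmo")])
# Each keyword is its own alias (weight 1.0); plurals, misspellings and
# concept aliases point at the same keyword with a lower weight.
conn.executemany("INSERT INTO synonym VALUES (?, ?, ?)", [
    ("widget", 1, 1.0), ("widgets", 1, 0.9), ("widgit", 1, 0.8),
    ("gadget", 2, 1.0), ("gizmo", 2, 0.5),
    ("gizmo", 3, 1.0),
])
conn.executemany("INSERT INTO product_keyword VALUES (?, ?)",
                 [("Blue Widget", 1), ("Pocket Gadget", 2), ("Gizmo Pro", 3)])


def search(entry):
    """Match whatever the customer typed against the alias column."""
    rows = conn.execute(
        """SELECT pk.product, s.weight
           FROM synonym AS s
           JOIN product_keyword AS pk ON pk.keyword_id = s.keyword_id
           WHERE s.alias = ?
           ORDER BY s.weight DESC, pk.product""",
        (entry.lower(),),
    ).fetchall()
    if not rows:
        # Record the miss so a new misspelling can be reviewed and added later.
        conn.execute("INSERT INTO miss_log VALUES (?)", (entry.lower(),))
    return rows
```

With this data, search("widgit") finds the product tagged with the singular keyword, and search("gizmo") ranks the exact keyword match above the alias match. An entry that matches nothing lands in miss_log for later review.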
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Monday, May 20, 2013 10:59 AM
Points: 129,
Visits: 224
|
|
I think a hash table would be a much better implementation. It would be hugely faster, and if the word isn't found in the table, it would be easy to grab all the words near its hash as suggestions.
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Wednesday, March 27, 2013 3:22 PM
Points: 1,276,
Visits: 1,112
|
|
tnolan (7/21/2008) Not a single reference to SOUNDEX? That doesn't seem right in a spelling suggestion article.
Soundex is horrible. And the SQL Server implementation doesn't match the NARA Soundex standard anyway. If you want to do phonetic matching, I'd recommend starting with a better phonetic matching algorithm. For spell-checking you'll get better accuracy using longest common subsequence (LCS), edit distance, or n-gram matching. Set-based operations work well with n-gram matching.
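To make the n-gram and edit-distance suggestions concrete, here is a minimal sketch in Python rather than T-SQL. It assumes padded trigrams, Jaccard similarity for ranking, and Levenshtein distance as a tie-breaker; those choices and all function names are illustrative assumptions, not something prescribed in this thread. The inverted index from trigram to words plays the role a helper table would play in a set-based query.

```python
from collections import Counter, defaultdict


def trigrams(word):
    """Return the set of 3-character grams, padded so word edges count."""
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def edit_distance(a, b):
    """Classic Levenshtein distance using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def build_index(words):
    """Inverted index: trigram -> set of dictionary words containing it."""
    index = defaultdict(set)
    for w in words:
        for g in trigrams(w):
            index[g].add(w)
    return index


def suggest(entry, index, limit=3):
    """Rank words sharing trigrams with the entry, best match first."""
    grams = trigrams(entry)
    shared = Counter()
    for g in grams:
        for w in index.get(g, ()):
            shared[w] += 1
    scored = []
    for w, n in shared.items():
        jaccard = n / (len(grams) + len(trigrams(w)) - n)
        scored.append((-jaccard, edit_distance(entry.lower(), w), w))
    return [w for _, _, w in sorted(scored)[:limit]]


words = ["server", "service", "several", "sever", "query", "index"]
index = build_index(words)
print(suggest("servr", index))
```

Counting shared grams per candidate is the same group-and-count that a join between an entry's trigrams and a trigram helper table would do, which is why this technique suits set-based SQL.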
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Wednesday, March 27, 2013 3:22 PM
Points: 1,276,
Visits: 1,112
|
|
gcopeland (7/21/2008) I think a hash table would be a much better implementation. It would be hugely faster, and if the word doesn't hash, it would be easy to grab all the words near the hash as suggestions.
A ternary search tree (TST) might work even better, although it would be hard to develop in SQL without using CLR. In a TST, words that are similar tend to "bunch" together as well. BTW, just wondering: what type of hash function would you use to get similar words to group together? If the function is too simple, you're going to pay a penalty in access costs.
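One possible answer to the hash-function question, offered only as an assumption about what could work and not as what gcopeland had in mind: instead of hoping an ordinary hash places similar words near each other, store each dictionary word under several keys, namely the word itself plus every variant with one character deleted. Two words that share any such key are at most a couple of edits apart. The Python sketch below uses hypothetical names throughout.

```python
from collections import defaultdict


def deletion_keys(word):
    """The word itself plus every string formed by deleting one character."""
    word = word.lower()
    keys = {word}
    keys.update(word[:i] + word[i + 1:] for i in range(len(word)))
    return keys


def build_deletion_index(words):
    """Hash table: deletion key -> set of dictionary words producing it."""
    index = defaultdict(set)
    for w in words:
        for k in deletion_keys(w):
            index[k].add(w)
    return index


def lookup(entry, index):
    """Return dictionary words sharing at least one deletion key."""
    found = set()
    for k in deletion_keys(entry):
        found |= index.get(k, set())
    return sorted(found)


index = build_deletion_index(["search", "speech", "spell", "spelling", "shell"])
print(lookup("serch", index))   # a missing letter
print(lookup("shell", index))   # exact hit plus a one-letter substitution
```

The access-cost trade-off raised above still applies: the index holds roughly one key per character of every word, so the table grows with average word length in exchange for a constant number of lookups per entry.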
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Tuesday, April 09, 2013 1:17 PM
Points: 111,
Visits: 534
|
|
Regarding the number of round trips: in my implementation of spell suggestion, I use a single stored procedure call containing a single set-based T-SQL statement, even though the search entry is often multiple words. The search entry goes in, and a spelling suggestion message, empty or filled, comes out.
Although the performance seems great, I'd like to take the chance later to go into the specifics of the stored procedure, including the performance angle, and see if Andy and some of the other performance gurus around here will share their optimization tips.
Thank you everybody for taking the time to read. Let me know if you find other helper table techniques!
Bill
Bill Nicolich: www.SQLFave.com. Daily tweet of what's new and interesting: AppendNow
|
|
|
|
|
Old Hand
      
Group: General Forum Members
Last Login: Wednesday, April 07, 2010 2:37 PM
Points: 315,
Visits: 164
|
|
Sorry guys, my SOUNDEX suggestion was not meant entirely seriously. The typo function provided here is pretty well done, but I think adding phonetic search would add to its overall usefulness. I also do not like SQL SOUNDEX very much. ;) http://microsoft.apress.com/index.php?id=72 or even a conversion of http://everything2.com/node/459981 could easily be worked into the code here for phonetic search.
|
|
|
|