Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Owning the Spelling Suggestion Feature Expand / Collapse
Author
Message
Posted Saturday, July 19, 2008 11:12 AM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, December 12, 2013 1:09 PM
Points: 111, Visits: 541
Comments posted to this topic are about the item Owning the Spelling Suggestion Feature

Bill Nicolich: www.SQLFave.com.
Daily tweet of what's new and interesting: AppendNow
Post #537168
Posted Monday, July 21, 2008 1:30 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Wednesday, April 7, 2010 2:37 PM
Points: 315, Visits: 164
Not a single reference to SOUNDEX? That doesn't seem right in a spelling suggestion article.
Post #537451
Posted Monday, July 21, 2008 1:44 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, September 26, 2013 3:08 AM
Points: 148, Visits: 49
So add it!
SOUNDEX doesn't cover, particularly, the typos angle that Bill seems to have concentrated on. Probably a lot better for the search results, given its sometimes 'strange' output.

Seriously though, if you incorporated this with some Ajax, and used that for dynamic spelling correction, like Word does, it would be very nifty.

I recently built an Ajax menu that works on the very simple concept of 'update as you type'. Given folks' ability for typos and fat-finger, this might just get added to second-guess them.




Post #537457
Posted Monday, July 21, 2008 5:52 AM
SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: Moderators
Last Login: Today @ 6:56 AM
Points: 6,804, Visits: 1,934
My concern would be the number of round trips that might get generated. Other than that, I've always used a 3rd party spell checker (including the one in Word at times) and while they weren't cheap, they typically aren't hugely expensive either.

Andy
SQLAndy - My Blog!
Connect with me on LinkedIn
Follow me on Twitter
Post #537591
Posted Monday, July 21, 2008 7:04 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, July 22, 2010 8:59 AM
Points: 110, Visits: 952
We went with the premise that people would rather find what they're looking for than be prompted with spelling corrections, so we wrote the search feature to join the product keywords to the misspellings as synonyms of the correct term, and match on whatever the customer entered. We monitor the misses between requested keywords and those already in the search table, so new misspellings are added from those seen in the wild. The programmatic permutations could be easily incorporated.

BTW, an added benefit of the "synonyms" notion is that we can make the plural form of the keyword a synonym of the singular so the content editors need only associate the singular keyword with the product. Also those keywords that are similar in concept can be aliased together so only one of that concept group needs to be applied to the product to allow any of the group to match. (different rank/weightings can be used to resolve exact matches higher in the results than alias matches)
Post #537635
Posted Monday, July 21, 2008 8:49 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, December 10, 2014 9:48 AM
Points: 196, Visits: 300
I think a hash table would be a much better implementation. It would be hugely faster, and if the word doesn't hash, it would be easy to grab all the words near the hash as suggestions.
Post #537706
Posted Monday, July 21, 2008 8:51 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Wednesday, September 24, 2014 1:20 PM
Points: 1,276, Visits: 1,135
tnolan (7/21/2008)
Not a single reference to SOUNDEX? That doesn't seem right in a spelling suggestion article.


Soundex is horrible. And the SQL Server implementation doesn't match with the NARA Soundex standard anyway. If you want to do phonetic matching, I'd recommend starting with a better phonetic match algorithm. For spell-checking you'll get better accuracy using LCS, edit distance, or n-gram matching. Set-based operations work well with n-gram matching.
Post #537708
Posted Monday, July 21, 2008 8:53 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Wednesday, September 24, 2014 1:20 PM
Points: 1,276, Visits: 1,135
gcopeland (7/21/2008)
I think a hash table would be a much better implementation. It would be hugely faster, and if the word doesn't hash, it would be easy to grab all the words near the hash as suggestions.


A ternary search tree might work even better (although it would be hard to develop in SQL without using CLR). In the TST words that are similar tend to "bunch" together as well. BTW, just wondering but what type of hash function would you use to get similar words to group together? If the function is too simple you're going to pay a penalty in access costs.
Post #537712
Posted Monday, July 21, 2008 4:32 PM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, December 12, 2013 1:09 PM
Points: 111, Visits: 541
Regarding number of round trips... In my implementation of spell suggestion, I use a single stored procedure call which uses a single T-SQL statement using a set-based approach even though the search entry is often multiple words. The search entry goes in, empty or filled spelling suggestion message goes out.

Although the performance seems great, I'd like to take the chance later to go into the specifics of the stored procedure including the performance angle and see if Andy and some of the other performance gurus around here will share their optimization tips.

Thank you everybody for taking the time to read. Let me know if you find other helper table techniques!

Bill


Bill Nicolich: www.SQLFave.com.
Daily tweet of what's new and interesting: AppendNow
Post #538057
Posted Monday, July 21, 2008 5:41 PM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Wednesday, April 7, 2010 2:37 PM
Points: 315, Visits: 164
Sorry guys, my soundex suggestion was not meant entirely seriously. The typo function provided here is pretty well done, but I think adding phonetic search would add to its overall use. I also do not like SQL SOUNDEX very much. ;) http://microsoft.apress.com/index.php?id=72 or even a conversion of http://everything2.com/node/459981 could easily be worked in to the code here for phonetic search.
Post #538070
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse