Keyword Searching in SQL Server


Michael Ahmadi
Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/mAhmadi/2875.asp

Wavesailor
Hi,

I'm a bit of a newbie. I've seen that SQL Server has the ability to do Full Text Searching. I've never used it, but why did you not use it? Just curious, as I'll need to implement something like this soon.

Thanks

AnJar
If you want to refine the search, you can put all articles into an n-dimensional vector space, where n is the number of distinct keywords in your keyword table. Each entry of your keyword log (LogEntry_Keyword) can then be understood as a vector. Vectors that are "near" each other in Euclidean distance will then presumably contain related content. Have fun
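
The vector-space idea above can be sketched in a few lines. This is a minimal illustration in Python rather than T-SQL, so that it runs without a server. The sample entries, the 0/1 weighting, and the helper names (`keyword_vector`, `euclidean`, `nearest`) are assumptions made for the example and do not come from the article.

```python
import math

# Hypothetical log entries: entry id -> keywords attached to it.
entries = {
    1: ["sql", "index", "performance"],
    2: ["sql", "index", "fragmentation"],
    3: ["backup", "restore"],
}

# One dimension per distinct keyword across all entries.
vocabulary = sorted({kw for tags in entries.values() for kw in tags})


def keyword_vector(tags, vocabulary):
    """Return a 0/1 vector with a 1 for every keyword the entry carries."""
    tagset = set(tags)
    return [1.0 if kw in tagset else 0.0 for kw in vocabulary]


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


vectors = {eid: keyword_vector(tags, vocabulary) for eid, tags in entries.items()}


def nearest(entry_id):
    """Return the id of the other entry closest to entry_id."""
    candidates = [
        (euclidean(vectors[entry_id], vec), other)
        for other, vec in vectors.items()
        if other != entry_id
    ]
    return min(candidates)[1]
```

Here entries 1 and 2 share two of their three keywords, so `nearest(1)` picks entry 2 over the unrelated entry 3. A real implementation would likely weight keywords by rarity instead of using plain 0/1 values.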

Michael Ahmadi
Don't get the impression that the methodology described here is by any means a substitute for full text search - it certainly is not. In the article we are simply defining a keyword as the text between space characters and we are in control of what tags we use to describe a log entry. In a real-world scenario you are more likely going to need the abilities of the full text engine's parser (word breaker) as well as the efficiency and versatility of full text querying. This is in fact a self-contained solution to the problem of keyword search, but it is a bare-bones solution. I use it mainly for keeping track of information that can be described with a handful of keywords.

Mike

Mike C
This is an interesting idea, and I think I get the author's point - why use the FTS "sledgehammer" for simpler tasks that don't require all the extra functionality? One thing that might add value to the tokenization/matching the author presents when compared to FTS is the ability to do approximate matching: phonetic, edit distance, n-gram, or common substring matching (or maybe some combination of these). You can actually get very good performance and good accuracy matches from a set-based n-gram solution.
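
As a rough illustration of the n-gram matching mentioned above, here is a minimal sketch in Python (chosen so it can be run without a server). The trigram size, the `_` padding, the Jaccard score, and the function names are all assumptions for the example. In a set-based SQL version the n-grams would presumably be stored in a table and the overlap computed with a join.

```python
def ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded at both ends."""
    padded = "_" * (n - 1) + word.lower() + "_" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a, b, n=3):
    """Jaccard similarity of the two n-gram sets: shared grams / all grams."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


def best_match(term, keywords, n=3):
    """Return the stored keyword whose n-grams overlap most with the term."""
    return max(keywords, key=lambda kw: similarity(term, kw, n))
```

With this scoring, a misspelled search term such as "indxe" still matches the keyword "index" ahead of "insert" or "backup", because it shares the leading trigrams with it.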

Tao Klerks
This is a brilliant concept, and I'm amazed I have not seen it anywhere else!

There are some known issues with full-text search:
- when searching, you need to look for whole "words"; you cannot use arbitrary substrings
- you cannot (easily?) "partition/order" a full-text index by a key, e.g. "UserID" or "ClientID"; in a shared-tenant architecture (a SaaS environment with many clients in a single DB) this can be a very serious issue!
- administration/maintenance is very painful in SQL Server 2000 and earlier (I have not tried 2005, but it is reputedly much better)

If, instead of using a trigger to do the "tokenizing" in this solution, you used a scheduled job, along with a trigger that maintains an "UpdateRequired" flag of some sort on the record for the job to look at, you would basically be building your own "text search light" system, suitable for all sorts of uses...
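
The flag-plus-job arrangement described above might look roughly like this. It is a sketch only: it uses Python with an in-memory SQLite database as a stand-in for SQL Server, and the `LogEntry` table layout, the trigger name `trgFlagLogEntry`, and the function `retokenize_flagged` are hypothetical. Only `LogEntry_Keyword`, the `tags` column, and the "UpdateRequired" flag are named in this thread.

```python
import sqlite3

# Hypothetical schema, with SQLite standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LogEntry (
    id INTEGER PRIMARY KEY,
    tags TEXT NOT NULL,
    UpdateRequired INTEGER NOT NULL DEFAULT 1  -- new rows start out flagged
);
CREATE TABLE LogEntry_Keyword (entry_id INTEGER NOT NULL, keyword TEXT NOT NULL);

-- The trigger only raises the flag; it does no tokenizing itself.
CREATE TRIGGER trgFlagLogEntry AFTER UPDATE OF tags ON LogEntry
BEGIN
    UPDATE LogEntry SET UpdateRequired = 1 WHERE id = NEW.id;
END;
""")


def retokenize_flagged(conn):
    """The 'scheduled job': rebuild keywords for flagged rows, then clear the flag."""
    rows = conn.execute(
        "SELECT id, tags FROM LogEntry WHERE UpdateRequired = 1"
    ).fetchall()
    for entry_id, tags in rows:
        conn.execute("DELETE FROM LogEntry_Keyword WHERE entry_id = ?", (entry_id,))
        conn.executemany(
            "INSERT INTO LogEntry_Keyword (entry_id, keyword) VALUES (?, ?)",
            [(entry_id, kw) for kw in sorted(set(tags.split()))],
        )
        conn.execute("UPDATE LogEntry SET UpdateRequired = 0 WHERE id = ?", (entry_id,))
    conn.commit()
    return len(rows)


conn.execute("INSERT INTO LogEntry (tags) VALUES ('sql index')")
conn.execute("INSERT INTO LogEntry (tags) VALUES ('backup restore')")
first_run = retokenize_flagged(conn)   # both new rows are flagged
conn.execute("UPDATE LogEntry SET tags = 'sql trigger' WHERE id = 1")
second_run = retokenize_flagged(conn)  # only the edited row is flagged again
```

The point of the split is that the write path stays cheap (one flag update), while the tokenizing cost is paid later in batches by the job.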

It does have major disadvantages of course:
- will use much more space for the tokens than full-text search would
- will be less efficient when tokenizing
- will be less powerful when tokenizing (no word root identification etc)
- will probably/possibly be slower when returning matches on the entire table (but faster on a subset selected by a key that you specify)

All in all, a great option to keep in mind, I think. Does anyone see other major disadvantages (or advantages) that I am missing?

Thanks,
Tao


Sam Ellis
There appears to be a bug in the search stored procedure ... it was only searching on the first term given to it in the search string.

It was also eroding the string by one character on every loop, so it would break once the padding string had been totally eroded away.

To solve the problems change:

The line:
SET @kws = SUBSTRING(@kw, CHARINDEX(' ', @kws) + 1, LEN(@kws) - CHARINDEX(' ', @kws) - 1)

To this (note @kws instead of @kw as the first argument, and remember to remove the -1 at the end):
SET @kws = SUBSTRING(@kws, CHARINDEX(' ', @kws) + 1, LEN(@kws) - CHARINDEX(' ', @kws) )
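
To see why the -1 erodes the string, the arithmetic of the two expressions can be replayed outside the database. The following is a sketch in Python that mimics T-SQL `CHARINDEX`, `SUBSTRING`, and `LEN` for the simple cases used here. The helper names and the surrounding `split_terms` loop are assumptions, since the stored procedure itself is not shown in this thread.

```python
def charindex(needle, haystack):
    """1-based position of needle in haystack, 0 if absent (like CHARINDEX)."""
    return haystack.find(needle) + 1


def substring(s, start, length):
    """Like SUBSTRING for start >= 1 and length >= 0 (1-based start)."""
    if start < 1 or length < 0:
        raise ValueError("outside the range this sketch handles")
    return s[start - 1:start - 1 + length]


def sql_len(s):
    """Like LEN, which does not count trailing spaces."""
    return len(s.rstrip(" "))


def rest_fixed(kws):
    """Corrected expression: everything after the first space."""
    sp = charindex(" ", kws)
    return substring(kws, sp + 1, sql_len(kws) - sp)


def rest_with_minus_one(kws):
    """Same expression with the trailing -1: one character too short."""
    sp = charindex(" ", kws)
    return substring(kws, sp + 1, sql_len(kws) - sp - 1)


def split_terms(kws):
    """Hypothetical loop: peel off one term per pass using the corrected expression."""
    terms = []
    while charindex(" ", kws) > 0:
        terms.append(substring(kws, 1, charindex(" ", kws) - 1))
        kws = rest_fixed(kws)
    if kws:
        terms.append(kws)
    return terms
```

For the string 'sql index trigger', `rest_fixed` leaves 'index trigger', while `rest_with_minus_one` leaves 'index trigge', losing one more character on each pass.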

Tom3w
w00t fascinating concept.

Which versions of SQL Server is this intended for? I am getting an error from SQL Query Analyzer 2000 when I try to execute the trigger section on a db running in SQL Server 2005:

Server: Msg 207, Level 16, State 1, Procedure trgInsertLogEntry, Line 21
Invalid column name 'tags'.

It points to the line:
SET @tags = (SELECT tags FROM INSERTED)

Where does the table "INSERTED" get created?


