SQLServerCentral is supported by Red Gate Software Ltd.
 
Keyword Searching in SQL Server
Posted Friday, February 16, 2007 1:45 PM
Forum Newbie


Group: General Forum Members
Last Login: Wednesday, January 18, 2012 1:06 PM
Points: 8, Visits: 60
Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/mAhmadi/2875.asp
Post #345603
Posted Wednesday, April 11, 2007 10:11 AM
Forum Newbie


Group: General Forum Members
Last Login: Friday, June 1, 2012 2:02 AM
Points: 5, Visits: 13
Hi,

I'm a bit of a newbie, but I've seen that SQL Server has the ability to do Full-Text Search. I've never used it myself; why did you not use it here? Just curious, as I'll need to implement something like this soon.

Thanks
Post #357559
Posted Wednesday, April 11, 2007 2:53 PM
Forum Newbie


Group: General Forum Members
Last Login: Monday, June 11, 2007 3:31 AM
Points: 1, Visits: 1
If you want to refine the search, you can put all articles into an n-dimensional vector space, where n is the number of distinct keywords in your keyword table. Each entry of your keyword log (LogEntry_Keyword) can then be understood as a vector. Vectors that are "near" each other in the Euclidean sense will then presumably contain related content. Have fun!
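
A minimal sketch of that vector-space idea, written in Python rather than T-SQL so it can be run on its own. It assumes each log entry has already been reduced to a set of keywords; the entries, keywords, and function names below are made up for illustration, and only the idea of comparing keyword vectors by Euclidean distance comes from the post.

```python
import math

# Hypothetical keyword sets per log entry (illustrative data only).
entries = {
    1: {"sql", "index", "performance"},
    2: {"sql", "index", "tuning"},
    3: {"backup", "restore"},
}

# One dimension per distinct keyword, in a fixed order.
vocabulary = sorted(set().union(*entries.values()))


def to_vector(keywords):
    """Binary vector: 1 if the entry carries the keyword, else 0."""
    return [1 if word in keywords else 0 for word in vocabulary]


def euclidean(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


vectors = {entry_id: to_vector(kws) for entry_id, kws in entries.items()}


def nearest(entry_id):
    """Ids of the other entries, ordered from closest to farthest."""
    origin = vectors[entry_id]
    others = [
        (euclidean(origin, vec), other)
        for other, vec in vectors.items()
        if other != entry_id
    ]
    return [other for _, other in sorted(others)]


print(nearest(1))  # entry 2 shares two keywords with entry 1, entry 3 shares none
```

With this sample data, entry 2 comes out closer to entry 1 than entry 3 does, because entries 1 and 2 differ in only two dimensions.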
Post #357651
Posted Friday, April 13, 2007 1:05 PM
Forum Newbie


Group: General Forum Members
Last Login: Wednesday, January 18, 2012 1:06 PM
Points: 8, Visits: 60
Don't get the impression that the methodology described here is by any means a substitute for full text search - it certainly is not. In the article we are simply defining a keyword as the text between space characters, and we are in control of what tags we use to describe a log entry. In a real-world scenario you are more likely to need the abilities of the full text engine's parser (word breaker) as well as the efficiency and versatility of full text querying. This is in fact a self-contained solution to the problem of keyword search, but it is a bare-bones solution. I use it mainly for keeping track of information that can be described with a handful of keywords.
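
To make "text between space characters" concrete, here is a small Python sketch (not the article's T-SQL). The function names are hypothetical, and lowercasing and de-duplicating the keywords are assumptions added for the example, not something the article is known to do.

```python
def tokenize(tags):
    """Split a tag string on spaces, dropping empty pieces from repeated spaces."""
    return [piece.lower() for piece in tags.split(" ") if piece]


def keyword_rows(entry_id, tags):
    """One (entry_id, keyword) pair per distinct keyword, in first-seen order."""
    seen = []
    for keyword in tokenize(tags):
        if keyword not in seen:
            seen.append(keyword)
    return [(entry_id, keyword) for keyword in seen]


print(keyword_rows(7, "SQL  index sql"))
print(tokenize("full-text search."))
```

The second call shows the limitation: punctuation stays attached to the word ("search."), which is the kind of case a real word breaker handles and this approach does not.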

Mike
Post #358365
Posted Wednesday, February 20, 2008 10:40 PM


Ten Centuries


Group: General Forum Members
Last Login: Monday, April 14, 2014 4:18 PM
Points: 1,276, Visits: 1,132
This is an interesting idea, and I think I get the author's point - why use the FTS "sledgehammer" for simpler tasks that don't require all the extra functionality? One thing that might add value to the tokenization/matching the author presents when compared to FTS is the ability to do approximate matching: phonetic, edit distance, n-gram, or common substring matching (or maybe some combination of these). You can actually get very good performance and good accuracy matches from a set-based n-gram solution.
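
A rough Python sketch of n-gram matching, to show the kind of approximate match meant here. It is not the set-based SQL solution mentioned above (that would typically join on a table of n-grams); the padding scheme, the Dice coefficient, the 0.5 threshold, and all names are assumptions chosen for illustration.

```python
def ngrams(word, n=3):
    """Set of character n-grams, padded with spaces so word edges count too."""
    padded = " " * (n - 1) + word.lower() + " " * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a, b, n=3):
    """Dice coefficient on n-gram sets: 1.0 for identical words, 0.0 for no overlap."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def best_matches(term, keywords, threshold=0.5):
    """Keywords whose similarity to the term reaches the threshold, best first."""
    scored = [(similarity(term, kw), kw) for kw in keywords]
    return [kw for score, kw in sorted(scored, reverse=True) if score >= threshold]


# A misspelled search term still finds the intended keyword.
print(best_matches("sever", ["server", "backup", "index"]))
```

Here the misspelling "sever" shares five of its trigrams with "server", which is enough to pass the threshold, while "backup" and "index" share none.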
Post #458360
Posted Thursday, February 21, 2008 6:01 AM
Ten Centuries


Group: General Forum Members
Last Login: Friday, July 25, 2014 6:37 AM
Points: 1,385, Visits: 1,242
This is a brilliant concept, and I'm amazed I have not seen it anywhere else!

There are some known issues with full-text search:
- when searching, you need to look for whole "words"; you cannot use arbitrary substrings
- you cannot (easily?) "partition/order" a full-text index by a key, e.g. "UserID" or "ClientID" - in a shared-tenant architecture (a SaaS environment with many clients in a single DB) this can be a very serious issue!
- administration/maintenance is very painful in SQL Server 2000 and earlier (I have not tried 2005, but it is reputedly much better)

If, instead of using a trigger to do the "tokenizing" in this solution, you used a scheduled job, along with a trigger that maintains an "UpdateRequired" flag of some sort on the record for the job to look at, you would basically be building your own "text search light" system, suitable for all sorts of uses...
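
The flag-plus-job idea can be sketched in a few lines of Python. This only models the control flow: the dictionaries stand in for tables, and every name here (log_entries, keyword_index, update_required, save_entry, run_tokenize_job) is hypothetical rather than taken from the article.

```python
# Hypothetical in-memory stand-ins for the tables involved.
log_entries = {}    # entry_id -> {"tags": str, "update_required": bool}
keyword_index = {}  # entry_id -> list of keywords


def save_entry(entry_id, tags):
    """What the lightweight trigger would do: store the row and flag it."""
    log_entries[entry_id] = {"tags": tags, "update_required": True}


def run_tokenize_job():
    """What the scheduled job would do: re-tokenize flagged rows, then clear the flag."""
    processed = 0
    for entry_id, row in log_entries.items():
        if row["update_required"]:
            keyword_index[entry_id] = row["tags"].split()
            row["update_required"] = False
            processed += 1
    return processed


save_entry(1, "sql index")
save_entry(2, "backup")
print(run_tokenize_job())  # both rows are flagged on the first run
print(run_tokenize_job())  # nothing left to do on the second run
```

The trade-off is that writes stay cheap, but the keyword index lags behind the data until the next job run.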

It does have major disadvantages of course:
- will use much more space for the tokens than full-text search would
- will be less efficient when tokenizing
- will be less powerful when tokenizing (no word root identification etc)
- will probably/possibly be slower when returning matches on entire table (but faster on subset by a key that you specify)

All in all, I think it's a great option to keep in mind. Does anyone see other major disadvantages (or advantages) that I am missing?

Thanks,
Tao


http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.
Post #458504
Posted Friday, February 22, 2008 10:19 AM


Grasshopper


Group: General Forum Members
Last Login: Wednesday, March 19, 2014 8:16 AM
Points: 15, Visits: 139
There appears to be a bug in the search stored procedure: it only searches on the first term given to it in the search string.

It also erodes the string by one character on every loop, so it breaks once the padding string has been completely eroded away.

To solve both problems, change this line:
SET @kws = SUBSTRING(@kw, CHARINDEX(' ', @kws) + 1, LEN(@kws) - CHARINDEX(' ', @kws) - 1)

to this (note @kws rather than @kw as the first argument, and remember to remove the -1 at the end):
SET @kws = SUBSTRING(@kws, CHARINDEX(' ', @kws) + 1, LEN(@kws) - CHARINDEX(' ', @kws))
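
The effect of the two changes can be checked with a small Python model of the expression. The helpers below only imitate T-SQL CHARINDEX, LEN, and SUBSTRING for the cases used here. The rest of the stored procedure is not shown in this thread, so the sketch covers this one statement only, and it assumes @kw holds the term currently being searched.

```python
def charindex(needle, haystack):
    """Like T-SQL CHARINDEX: 1-based position, 0 when not found."""
    return haystack.find(needle) + 1


def tsql_len(text):
    """Like T-SQL LEN: trailing spaces are not counted."""
    return len(text.rstrip(" "))


def substring(text, start, length):
    """Like T-SQL SUBSTRING for start >= 1; a negative length is an error."""
    if length < 0:
        raise ValueError("invalid length parameter")
    return text[start - 1:start - 1 + length]


def rest_original(kw, kws):
    # Posted original: reads from @kw (assumed to be the current term) with an extra -1.
    pos = charindex(" ", kws)
    return substring(kw, pos + 1, tsql_len(kws) - pos - 1)


def rest_minus_one(kws):
    # Reads from @kws but keeps the trailing -1: one character is lost per pass.
    pos = charindex(" ", kws)
    return substring(kws, pos + 1, tsql_len(kws) - pos - 1)


def rest_fixed(kws):
    # Posted fix: reads from @kws and keeps everything after the first space.
    pos = charindex(" ", kws)
    return substring(kws, pos + 1, tsql_len(kws) - pos)


print(repr(rest_original("sql", "sql server index")))  # ''
print(repr(rest_minus_one("sql server index")))        # 'server inde'
print(repr(rest_fixed("sql server index")))            # 'server index'
```

Under those assumptions, reading from @kw leaves nothing to search after the first term, and keeping the -1 drops the last character on each pass, which matches both symptoms described above.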
Post #459201
Posted Monday, November 10, 2008 3:16 PM
SSC Rookie


Group: General Forum Members
Last Login: Thursday, March 1, 2012 6:46 AM
Points: 37, Visits: 60
Fascinating concept.

Which versions of SQL Server is this intended for? I am getting an error from SQL Query Analyzer 2000 when I try to execute the trigger section against a database running on SQL Server 2005:

Server: Msg 207, Level 16, State 1, Procedure trgInsertLogEntry, Line 21
Invalid column name 'tags'.

The error points to this line:
SET @tags = (SELECT tags FROM INSERTED)

Where does the table "INSERTED" get created?



Post #600257