Key Word Searches

  • Comments posted to this topic are about the item Key Word Searches

  • Does this make sense. Why would you not use the built in full text search. Its good enough to power sharepoints document search. its built in from 2005 on and can easily index any text field.

  • I agree with Kermit--does this technique offer any benefits over just using full-text search? You can even get SQL Express with full-text search capabilities now!

  • KermitTheRock (2/28/2011)


    Does this make sense. Why would you not use the built in full text search. Its good enough to power sharepoints document search. its built in from 2005 on and can easily index any text field.

    My thoughts exactly, why re-invent the wheel if full text search is readily available?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • All this have sense if your are going to search in non English text, or in the text with mixed language's words, and whant to support your own search rules: for example if you need to support synonyms: Baltic, Boltic, Boltik, Baltijske..

  • Hi

    Please include a link to where Jeff Moden posted his function.

    This search turned up this result.

    Is that where you got the function from?

    Thanks

    Andrew

  • AndrewJacksonZA (2/28/2011)


    Hi

    Please include a link to where Jeff Moden posted his function.

    This search turned up this result.

    Is that where you got the function from?

    Thanks

    Andrew

    That's likely it. I've only posted it a couple of times so far. As I said in the post, I'm still doing some testing on it for a couple of things including a performance curve in different environments.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I was going to post that the result table and the related methods for this code article are nearly identical to what full-text-search (FTS) does but I see that folks beat me to it. 🙂

    I've not worked with FTS before so I don't know if this article brings anything extra to the table (yet) and I haven't tested the code in this article (yet) BUT... the article appears to be a reasonable study of how FTS may work behind the scenes to build the "index table" and how it uses special tables for "noise words", etc.

    Why is that good? Heh... because SQL Server isn't the only database in town. 😀 This article shows how it could be done either in other database engines which may not have an FTS capability or in a "custom database".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Or in a lower, cheaper edition.

  • I've been using FTS on a view composed of several tables, and this method works pretty well, as I can return ranked results using "contains table". Not sure if the method above allows for ranked results?

    Also, you don't mention what happens if someone searches for "Hex Nuts" as opposed to "Hex Nut"...how do you account for that? I've resorted to stemming the search terms when the last characters are "s", "es", etc.

    Does anyone have any links for "predictive" search solutions in MSSQL, to catch misspellings, etc?

  • Actually, I wrote of this just a week ago. The Contains function is often not sufficient for end user search. It doesn't handle misspellings, suggested search, implied "AND", and crashes on use of a noise word.

    FTS is not simply an inverted index of words, but is dependent on the query pipelining that happens before the index is accessed.

  • KermitTheRock (2/28/2011)


    Actually, I wrote of this just a week ago. The Contains function is often not sufficient for end user search. It doesn't handle misspellings, suggested search, implied "AND", and crashes on use of a noise word.

    FTS is not simply an inverted index of words, but is dependent on the query pipelining that happens before the index is accessed.

    Hmmm...well I've worked some time on getting it to work reasonably well for a somewhat specialized database with lots of weird product names like "M&Ms". Not sure I understood what the consensus was on the previous thread, that "only google has the resources to do it", or "we only will do it when we're paid to do it"?

  • thx

    going to have to play with this. i have a huge database on a SQL 2005 server with full text indexing enabled and it can really be a PITA a lot of times. sometimes simple searches are extremely slow for some reason and inserting large amounts of data into a FT enabled table is a major FAIL at least with 2005. i have to disable FT, run my inserts and then enable it again. and maintaining the FTI is also a PITA as i've had a few corruptions due to the backup running during the FT maintenance times

  • alen teplitsky (2/28/2011)


    thx

    going to have to play with this. i have a huge database on a SQL 2005 server with full text indexing enabled and it can really be a PITA a lot of times. sometimes simple searches are extremely slow for some reason and inserting large amounts of data into a FT enabled table is a major FAIL at least with 2005. i have to disable FT, run my inserts and then enable it again. and maintaining the FTI is also a PITA as i've had a few corruptions due to the backup running during the FT maintenance times

    Just curious, how big is "huge"? I haven't had any of the above problems on 2005/8.

  • about 400GB or so for 30 days of data

    every day i have vb scripts dump security and application log data from domain controllers, sql and other servers into a central database. there are three tables with an average of 100 million rows for each one. i tried having one table per server but it was a major PITA with joins so i use one table for DC's, one for SQL servers and one for everything else. every morning there are SSRS reports that get fired off to people with app log errors, locked accounts, any data about anyone doing any kind of account manipulation in AD, etc.

Viewing 15 posts - 1 through 15 (of 38 total)

You must be logged in to reply to this topic. Login to reply