Full Text Indexing - Text Parsing Routine

  • Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/ckempster/fulltextindexingtextparsingroutine.asp


    Chris Kempster
    www.chriskempster.com
    Author of "SQL Server Backup, Recovery & Troubleshooting"
    Author of "SQL Server 2k for the Oracle DBA"

  • 1. How about the following condition:

    ISABOUT("drive safe" WEIGHT (.8), "drive" NEAR "safe" WEIGHT (.7),
    "drive*" NEAR "safe*" WEIGHT (.6), FORMSOF(INFLECTIONAL,"drive","safe") WEIGHT (.5))

    Would it be better?

    2. What is the idea with "é type characters"?

    Razvan

    Edited by - rsocol on 12/11/2003 12:57:56 AM

  • Hi there

    Very nice indeed. I will add a new parameter to the routine for this option and see how it goes in testing.

    Thanks again.

    Cheers

    Ck


  • For those interested, alter the final @search assignment to this:

    SELECT @search = 'ISABOUT("' + @rawsearch + '" WEIGHT(.8), ' + @search2 + ' WEIGHT (.7), ' + @search + ' WEIGHT(.6)) OR (' + @fuzzy + ')'

    and the routine will produce this:

    ISABOUT("drive safe" WEIGHT(.8), "drive" near "safe" WEIGHT (.7), "drive*" near "safe*" WEIGHT(.6)) OR (FORMSOF(INFLECTIONAL,"drive") AND FORMSOF(INFLECTIONAL,"safe"))

    Note that:

    FORMSOF(INFLECTIONAL,"drive","safe")

    does an OR between its terms, which can't be used in the above statement; only AND between separate FORMSOF clauses works there.
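
For anyone who wants to try the shape of this outside SQL Server, here is a minimal sketch in Python of how the final string could be assembled. It is not the routine itself (the T-SQL source is not reproduced in this thread); the function name `build_search` is hypothetical, and the local names simply mirror the `@rawsearch`, `@search2`, `@search` and `@fuzzy` variables in the assignment above. It assumes words are separated by whitespace and drops any double quotes in the input rather than escaping them.

```python
def build_search(rawsearch: str) -> str:
    """Assemble a CONTAINS-style search condition from a raw search string.

    Hypothetical helper: mirrors the @search assignment quoted in the post,
    not the original T-SQL routine.
    """
    # Assumption: double quotes are simply dropped, words split on whitespace.
    words = rawsearch.replace('"', " ").split()
    if not words:
        raise ValueError("search string contains no words")

    phrase = " ".join(words)                                    # exact phrase
    search2 = " near ".join(f'"{w}"' for w in words)            # proximity
    search = " near ".join(f'"{w}*"' for w in words)            # prefix + proximity
    # One FORMSOF clause per word, joined with AND (not one multi-term FORMSOF).
    fuzzy = " AND ".join(f'FORMSOF(INFLECTIONAL,"{w}")' for w in words)

    return (
        f'ISABOUT("{phrase}" WEIGHT(.8), {search2} WEIGHT (.7), '
        f"{search} WEIGHT(.6)) OR ({fuzzy})"
    )


if __name__ == "__main__":
    print(build_search("drive safe"))
```

Calling `build_search("drive safe")` gives the same string as the example output above, which makes it easy to compare against what the stored procedure returns.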

    Cheers

    Ck


  • With the é (accented characters). No matter the language settings for the character fields (ie. accent insensitive), we found the é and other such characters wouldnt translate, so café wouldnt come up if we searched for cafe for example, this was a real pain to deal with, therefore stripping was required.

    If someone has a solution for this, id love to hear it.
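
The thread does not show how the stripping was actually done, so the following is only a sketch of one common approach, written in Python: decompose each character into its base letter plus combining marks (Unicode NFD), then drop the marks. The function name `strip_accents` is hypothetical. The same function would have to be applied to both the indexed text and the search string for cafe to match café.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining accent marks, e.g. turning 'café' into 'cafe'.

    Hypothetical helper; not the method used by the routine in this thread.
    """
    # NFD splits 'é' into 'e' followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(strip_accents("café"))
```

Letters that have no decomposition, such as ø or ß, pass through unchanged, so they would still need an explicit replacement table.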

    Cheers

    Ck


  • Very much appreciated. I've not yet done a lot of testing or looked at the logic in detail, however, I did notice something peculiar. If the input parameter is varchar(500), how can the output parameter also be varchar(500)? If I indeed send all 500 characters in, it seems the result coming back would be truncated.

    This is just a minor detail, but it could create a problem that would be hard to diagnose.
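
To put a number on the size concern, here is a rough sketch in Python that measures how long the generated condition gets for a given input. It assumes the output format quoted earlier in this thread (each word appears four times, plus the ISABOUT, WEIGHT and FORMSOF wrappers); the function name `expanded_length` is hypothetical and the actual routine may differ.

```python
def expanded_length(rawsearch: str) -> int:
    """Length of the search condition generated for rawsearch.

    Assumes the ISABOUT(...) OR (FORMSOF(...) AND ...) format quoted in
    this thread; not taken from the routine's source.
    """
    words = rawsearch.split()
    phrase = " ".join(words)
    near = " near ".join(f'"{w}"' for w in words)
    prefix = " near ".join(f'"{w}*"' for w in words)
    fuzzy = " AND ".join(f'FORMSOF(INFLECTIONAL,"{w}")' for w in words)
    expanded = (
        f'ISABOUT("{phrase}" WEIGHT(.8), {near} WEIGHT (.7), '
        f"{prefix} WEIGHT(.6)) OR ({fuzzy})"
    )
    return len(expanded)


if __name__ == "__main__":
    sample = " ".join(["word"] * 100)  # 499 characters of input
    print(len(sample), expanded_length(sample))
```

Since every word is repeated four times, the output is always several times longer than the input, so it is worth checking the result against whatever size the output parameter is declared with.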

  • Hi there, very observant. This is actually a bug; we altered it to 2000 just the other day, as the size really blows out for large search strings. Thanks again.

    Chris Kempster


  • Hi, I really liked this code 🙂

    I've ported it into C# to give me more control over the search term, I'd love to know what you guys think 🙂

    Check it out: How to build an SQL full text index search term in C#

    Trull

    muonlab web development blog
