Thesaurus Files

  • Comments posted to this topic are about the item Thesaurus Files

  • From BOL 2008 Thesaurus Configuration

    ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10de_1devconc/html/3ef96a63-8a52-45be-9a1f-265bff400e54.htm

    You can define two forms of synonyms, expansion sets and replacement sets. By developing a thesaurus tailored to your full-text data, you can effectively broaden the scope of full-text queries on that data.

    Expansion set

    An expansion set contains a group of synonyms such as "writer", "author", and "journalist" that are substituted for one another by a full-text query. Queries that contain a match for any synonym in an expansion set are expanded to include every other synonym in the expansion set.

    Can someone explain to me how this supports the supposedly correct answer?

    If everything seems to be going well, you have obviously overlooked something.

    Ron


  • A thesaurus query uses both a language-specific thesaurus and the global thesaurus. First, the query looks up the language-specific file and loads it for processing (unless it is already loaded). The query is expanded to include the language-specific synonyms specified by the expansion set and replacement set rules in the thesaurus file. These steps are then repeated for the global thesaurus. However, if a term is already part of a match in the language-specific thesaurus file, the term is ineligible for matching in the global thesaurus.

    reference site

    http://msdn.microsoft.com/en-us/library/ms142491.aspx
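
    The lookup order quoted above can be sketched as a toy model in Python. This is only an illustration of the rule "language-specific first, then global, and a term matched in the language-specific file is not matched again in the global one". The thesaurus contents and the names `first_match` and `expand_query` are hypothetical, and nothing here reflects how SQL Server is implemented internally.

```python
# Toy model only: the thesaurus contents below are hypothetical.
LANGUAGE_SETS = [["writer", "author", "journalist"]]
GLOBAL_SETS = [["writer", "scribe"], ["car", "automobile"]]


def first_match(term, expansion_sets):
    """Return the first expansion set containing `term`, or None."""
    for synonyms in expansion_sets:
        if term in synonyms:
            return list(synonyms)
    return None


def expand_query(terms, language_sets, global_sets):
    """Expand each term, consulting the language-specific sets first."""
    expanded = []
    for term in terms:
        match = first_match(term, language_sets)
        if match is None:
            # Only terms with no language-specific match are eligible
            # for matching in the global thesaurus.
            match = first_match(term, global_sets)
        expanded.extend(match if match is not None else [term])
    return expanded


print(expand_query(["writer", "car", "boat"], LANGUAGE_SETS, GLOBAL_SETS))
# ['writer', 'author', 'journalist', 'car', 'automobile', 'boat']
```

    Note that "writer" never picks up "scribe" from the global sets, because it was already matched in the language-specific sets.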

  • bitbucket (10/10/2008)


    Can someone explain to me how this supports the supposedly correct answer?

    The key word here is "recursively". Once a match is found in the thesaurus, SQL Server stops searching for further matches. For instance, consider a thesaurus that defines the following expansion set:

    <expansion>
        <sub>fl</sub>
        <sub>fla</sub>
        <sub>florida</sub>
    </expansion>

    And this expansion set:

    <expansion>
        <sub>fl</sub>
        <sub>fluid</sub>
    </expansion>

    A search for "fl" is expanded to "fl", "fla", and "florida". FTS stops at the first expansion set and does not go on to expand "fl" to "fluid" as defined by the second expansion set.

    As for the recursive part, consider a search for "fla". Again FTS will expand this to search for "fl", "fla", and "florida". It will not recursively search for more expansion or replacement rules, so "fl" will not be expanded out to "fluid" this time either.
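
    For anyone who finds it easier to read as code, here is a minimal Python sketch of the behavior described above: the first expansion set that contains the term wins, and the resulting synonyms are not looked up again. It is a toy model under those stated assumptions, not SQL Server's actual implementation, and the name `expand` is hypothetical.

```python
# Toy model of "first match wins, no recursion"; not SQL Server internals.
EXPANSION_SETS = [
    ["fl", "fla", "florida"],  # first expansion set in the thesaurus
    ["fl", "fluid"],           # second expansion set, also containing "fl"
]


def expand(term, expansion_sets):
    """Return the search terms for `term` using only the first matching set."""
    for synonyms in expansion_sets:
        if term in synonyms:
            # Stop at the first set that contains the term. The returned
            # synonyms are not looked up again, so nothing is recursive.
            return list(synonyms)
    # No set matched: the term is searched as-is.
    return [term]


print(expand("fl", EXPANSION_SETS))   # ['fl', 'fla', 'florida']
print(expand("fla", EXPANSION_SETS))  # ['fl', 'fla', 'florida']
```

    In both calls "fluid" is absent: "fl" stops at the first set, and the "fl" produced by expanding "fla" is never expanded a second time.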

  • Mike C.

    Thanks for that detailed explanation. Explains a lot and I will have to remember that when creating a thesaurus.

    Again thank you.

    If everything seems to be going well, you have obviously overlooked something.

    Ron


  • Mike, great explanation.

  • In my opinion, the great disadvantage of the MS thesaurus implementation is that SQL Server does not use the inflected forms of thesaurus words.

    If a user wants a "fully functional synonym", he has to add an expansion for every form of each word.

    For example:

    A table contains records with forms of the word "source". I added these synonyms:

    source
    lake
    river
    drink

    In this case, if I search for "rivers" or "drank", I never find these records(!).

    But if I also add this expansion:

    drink
    drank
    drunk
    drinking
    drinker
    drinkeress
    drinkable

    In this case, I can search for "drank" and find records with "source".

    Regards.
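
    Since every word form has to be listed by hand, one option is to generate the expansion set from a list of forms. The sketch below uses Python's standard `xml.etree.ElementTree` to build one `<expansion>` element with a `<sub>` child per form. The function name `expansion_element` is hypothetical, the list of forms is whatever you supply, and the output is only the fragment, not a complete thesaurus file.

```python
import xml.etree.ElementTree as ET


def expansion_element(forms):
    """Build an <expansion> element with one <sub> child per word form."""
    expansion = ET.Element("expansion")
    for form in forms:
        ET.SubElement(expansion, "sub").text = form
    return expansion


# Hand-supplied word forms; nothing here derives inflections automatically.
forms = ["drink", "drank", "drunk", "drinking"]
xml_text = ET.tostring(expansion_element(forms), encoding="unicode")
print(xml_text)
# <expansion><sub>drink</sub><sub>drank</sub><sub>drunk</sub><sub>drinking</sub></expansion>
```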

  • Hi Oleg,

    You could use the sys.dm_fts_parser dynamic management function to retrieve the expansion sets from SQL Server, and then use them to build a query string that includes all your inflectional and thesaurus word forms. I haven't tested the performance of sys.dm_fts_parser (I usually use it for one-off testing in Management Studio), so there might be performance implications.

    Thanks

    Mike C
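
    A rough sketch of the "recreate a query string" step, in Python. It assumes the rows have already been fetched from sys.dm_fts_parser, for example with a query such as SELECT display_term, expansion_type FROM sys.dm_fts_parser(N'FORMSOF(THESAURUS, "drink")', 1033, 0, 0). The sample rows below are hand-written stand-ins rather than real output, and `build_search_condition` is a hypothetical helper name.

```python
# Hand-written stand-ins for rows returned by sys.dm_fts_parser.
# expansion_type 0 marks the original word, 4 a thesaurus expansion.
sample_rows = [
    {"display_term": "drink", "expansion_type": 0},
    {"display_term": "source", "expansion_type": 4},
    {"display_term": "lake", "expansion_type": 4},
    {"display_term": "river", "expansion_type": 4},
    {"display_term": "drink", "expansion_type": 4},  # duplicate, dropped below
]


def build_search_condition(parser_rows):
    """OR together the distinct display terms, keeping first-seen order.

    Assumes the terms contain no double quotes; no escaping is attempted.
    """
    distinct_terms = []
    for row in parser_rows:
        term = row["display_term"]
        if term not in distinct_terms:
            distinct_terms.append(term)
    return " OR ".join('"' + term + '"' for term in distinct_terms)


print(build_search_condition(sample_rows))
# "drink" OR "source" OR "lake" OR "river"
```

    Each resulting term could then be wrapped in FORMSOF(INFLECTIONAL, ...) instead of plain quotes if the inflected forms are wanted as well; that variation is not shown here.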
