A problem with using full text search with wildcard query

  • Hi,

    I have a weird case with the full text search in Sql Server 2008 R2 SP1 web edition.

    I have a table that has a row with this value in the indexed field (mapped to a catalog):

    logitech wireless keyboard case for ipad eran

    When I query like that, I do get this row as expected:

    SELECT * FROM TBL_NAME

    WHERE CONTAINS (field_name,'"logitech wireless keyboard case for ipad"')

    I don't unserstand why when I run the following query, I do not get any result:

    SELECT * FROM TBL_NAME

    WHERE CONTAINS (field_name,'"logitech wireless keyboard case for ipad*"')

    I add the wildcard asterisk after the query phrase, in order to get all the rows that contains the queried phrase,abd that's why I build the string parameter for the CONTAINS query like that:

    ' + " + the queries string + * + " + '

    Isn't that the right way to do that?

    And if it is, then why does this row is not returned in this query?

    Any advises please?

    Thanks!

    Eran.

  • I'm not sure why you need the * here.

    * is supposed to be a prefixing term, so "ipad*" should match words that start with "ipad", e.g. "ipady", "ipadness", "ipadable" (none of which are in your row [nor are real words!]). From what I read in BOL, it should match zero, one or more characters, so this should also return "ipad" (and I am not completely sure why it does not).

    When used in a multi-word query, it gets even weirder: each word in the query term is treated like a prefix, so "logitech wireless keyboard case for ipad*" could actually match "logitechy wirelessing keyboards forty ipadooble", which is probably not what you are after.

    I would probably go one of two routes here, depending on how 'clever' I wanted the search to feel:

    1. consider the FREETEXT command for this, rather than CONTAINS.

    2. pre-process my search term to replace spaces with NEAR and use CONTAINSTABLE to determine how likely a match it was.

    An example below for the second option, which will pull back the top 5 matching rows:

    DECLARE @top_n INT = 5;

    DECLARE @search_term VARCHAR(MAX) = '("logitech*" NEAR "wireless*" NEAR "keyboard*" NEAR "case*" NEAR "for*" NEAR "ipad*")';

    SELECT t.*

    FROM tbl_name t

    INNER JOIN CONTAINSTABLE(tbl_name, field_name, @search_term, @top_n) ct

    ON t.pkid = ct.key;

    Notes:

    1. I've not tested the above code, at all. Use at your own risk.

    2. If the syntax does not support variables, you will have to expand this yourself.

    3. I would expect you to pre-process the search term in your application layer, but there are methods to do so in SQL, if you have to.

    4. You will want to make sure your query does not exist solely of full-text noise words -- I will leave that as an exercise to you.

    5. Since you probably don't want to match "logitechy wirelessing keyboards forty ipadooble", you might prefer to pre-process without the *. I would still quote, though, to reduce confusion between keyword NEAR and any search term "near" (or you could use ~ in place of NEAR).

  • jimbobmcgee (12/3/2012)


    I'm not sure why you need the * here.

    * is supposed to be a prefixing term, so "ipad*" should match words that start with "ipad", e.g. "ipady", "ipadness", "ipadable" (none of which are in your row [nor are real words!]). From what I read in BOL, it should match zero, one or more characters, so this should also return "ipad" (and I am not completely sure why it does not).

    When used in a multi-word query, it gets even weirder: each word in the query term is treated like a prefix, so "logitech wireless keyboard case for ipad*" could actually match "logitechy wirelessing keyboards forty ipadooble", which is probably not what you are after.

    I would probably go one of two routes here, depending on how 'clever' I wanted the search to feel:

    1. consider the FREETEXT command for this, rather than CONTAINS.

    2. pre-process my search term to replace spaces with NEAR and use CONTAINSTABLE to determine how likely a match it was.

    An example below for the second option, which will pull back the top 5 matching rows:

    DECLARE @top_n INT = 5;

    DECLARE @search_term VARCHAR(MAX) = '("logitech*" NEAR "wireless*" NEAR "keyboard*" NEAR "case*" NEAR "for*" NEAR "ipad*")';

    SELECT t.*

    FROM tbl_name t

    INNER JOIN CONTAINSTABLE(tbl_name, field_name, @search_term, @top_n) ct

    ON t.pkid = ct.key;

    Notes:

    1. I've not tested the above code, at all. Use at your own risk.

    2. If the syntax does not support variables, you will have to expand this yourself.

    3. I would expect you to pre-process the search term in your application layer, but there are methods to do so in SQL, if you have to.

    4. You will want to make sure your query does not exist solely of full-text noise words -- I will leave that as an exercise to you.

    5. Since you probably don't want to match "logitechy wirelessing keyboards forty ipadooble", you might prefer to pre-process without the *. I would still quote, though, to reduce confusion between keyword NEAR and any search term "near" (or you could use ~ in place of NEAR).

    Thanks for your answer jimbobmcgee.

    My need is to change an old

    LIKE 'search text%'

    query

    to a full text search query in order to improve performence,

    while keeping the search results as close as possible to what I got till now with the old LIKE query.

    I don't have a problem to manipulate the @search_term parameter before I

    send it to the store procedure.

    So, if my old store procedure produced this query:

    SELECT * FROM TBL_NAME

    WHERE FIELD_NAME LIKE 'logitech wireless keyboard case for ipad%'

    Is it possible to use a full text search query and get excactly the same results as I got before with the old query?

    If it is, what is the correct way to do so? And if it is not possible,

    what would give me the closest results (I would rather have more results, not less then before)?

    Thanks!

    Eran

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply