Full text index search using one-character word

  • Hello!

    I'm using full text index search. There are few cases when for our customer it is important to find words even with only one character. Unfortunately it seems that SQL server ignores such words, although I haven't found assurance for that in docs.

    So here is the scenario. Suppose I want to find word 3. My table data are as follows:

    select * from test_fts;

    id txt

    ----------- --------------------------------------------------

    1 3 trees

    2 33 trees

    3 green trees

    Now I'm trying to find the word 3.

    select * from test_fts

    where contains(txt, '"3"')

    id txt

    ----------- --------------------------------------------------

    Informational: The full-text search condition contained noise word(s).

    select * from test_fts

    where contains(txt, '"3*"')

    id txt

    ----------- --------------------------------------------------

    2 33 trees

    OK 3 is a noise word. I edited noise word list, rebuilt catalog and now the warning about noise words is gone, although the query do not return expected rows.

    select * from test_fts

    where contains(txt, '"3"')

    id txt

    ----------- --------------------------------------------------

    select * from test_fts

    where contains(txt, '"3*"')

    id txt

    ----------- --------------------------------------------------

    2 33 trees

    So it seems that full text index simply ignores words with 1 character, but is this anywhere documented?

  • yep it's documented; what you are looking for is editing the "noise words" file;

    mine happens to be located here:

    C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\FTDataoiseenu.txt.

    if you look inside it, you'll see all single digit numbers and all single letters are excluded by default, as well as a lot of other words.

    edit the file, but be prepared to put any changes back, as they were added due to others researching and getting a lot of search results that really needed to be excluded instead.

    single letter searchs just return too much, it seems.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Yep the problem is "it seems so". I don't like the word "seems". I would like to find exact place in docs where it is written.

    In official description about "contains" here http://msdn.microsoft.com/en-US/library/ms187787(v=SQL.90).aspx nothing is said about that. I was not able to find any other approval as well that one letter words are ignored despite their non-existence in noise word list.

    Otherwise I feel that probably I have overlooked something...

  • I am getting same problem but its reverse

    I am getting my record if I am searching with Contains(group_name , '"Region 3 North"') but its not return my data if I am searching with Contains(group_name , '"Region 3 North*"')

    I have already created stopList and assign to my index like

    CREATE FULLTEXT INDEX ON [dbo].[tbl_temp]

    (

    City

    Language 1033,

    State

    Language 1033,

    Country

    Language 1033,

    Zip

    Language 1033,

    Group_Name

    Language 1033

    )

    KEY INDEX [ClusteredIndex-20140625-154702]

    WITH STOPLIST = [stoplistName]--SYSTEM

    GO

    Thanks in advance.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply