Problem with Full-Text Searching

  • Hi all,

    I've got a database with full-text search enabled. Within that database, I have a table with two columns that are being indexed.

    One column is a VARBINARY(MAX) table, and is storing the content of documents of various types. The full-text index on that table references that column, as well as another column with the file type of the document being stored.

    The other column is an NVARCHAR(MAX) column, storing the plain text contained within the document, parsed using various parsers, in the event that the document doesn't contain xml-formatted content. This is specifically done for PDFs.

    The problem that I'm having is that, for some reason, the NVARCHAR(MAX) column isn't being indexed by the full-text engine. When using the query


    SELECT * FROM
    sys.dm_fts_index_keywords_by_document
    (
        DB_ID('DBName'),    
        OBJECT_ID('TableName')
    );
     

    The only records which show up in the catalog are those from the VARBINARY(MAX) column. There are no keywords obtained from the content of the NVARCHAR(MAX) column.

    I've tried rebuilding the catalog many times, I've even tried dropping the index on the table and re-creating it. Nothing seems to make the database index that column.

    Any ideas as to why this isn't working? Any information that I could provide to help? Here's the code I used to create the full-text index on that table:


    USE [DB]
    GO
    CREATE FULLTEXT INDEX ON [dbo].[Table] KEY INDEX [PK] ON ([Catalog]) WITH (CHANGE_TRACKING AUTO)
    GO
    USE [DB]
    GO
    ALTER FULLTEXT INDEX ON [dbo].[Table] ADD ([NVARCHARMAXColumn] LANGUAGE [English])
    GO
    USE [DB]
    GO
    ALTER FULLTEXT INDEX ON [dbo].[Table] ADD ([VARBINARYMAXColumn] TYPE COLUMN [FileTypeColumn] LANGUAGE [English])
    GO
    ALTER FULLTEXT INDEX ON [dbo].[Table] ENABLE
    GO

  • And now it's working ... all I changed was the language to Neutral instead of English. No idea why that fixed things, since the majority of the documents were in English to start off with.

    **EDIT**

    Okay something weird just happened. I ran a rebuild on the index, after the above changes, and then ran the query


    SELECT * FROM
    sys.dm_fts_index_keywords_by_document
    (
        DB_ID('OSTicket'),    
        OBJECT_ID('OSTicket_TicketThreadAttachments')
    );

    Running it immediately after index rebuilding resulted in all of the correct information appearing. However, as I continued to run it, the number of rows returned started dropping quickly, until after a few seconds, it was back where it was when I posted the problem to start off with.

    It seems like for some reason the index is pruning out results after initially populating them?

    ** EDIT 2 **
    Figured out the problem - for some reason having the index on the VARBINARY column is causing it to drop all of the words obtained from the NVARCHAR(MAX) column. Investigating further...

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply