Searching Words and Phrases in a Rich Text Field

  • Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/yEkhtiari/3054.asp

  • The author needs to get his facts straight and know the different capabilities of SQL Server 2000 and 2005. It sounds like the author has no experience of SQL Server 2005.

    The full-text catalogs and indexes are not stored in a SQL server databases,

    Correct

    which means if you restore the database on another server you have to do a lot of effort to build that index again.

    Incorrect.

    SQL Server 2000 and 2005 differ here. I believe 2005 backups the FTS data. It is very easy to rebuild a full-text index in either version!

    Also you have to set up a job to maintain the changes to full-text index that means your index is not up to date

    Not true.

    You don't have to have a job. You can do it manually, scheduled or automatically. I use the automatic method on a very large database without issue whatsoever. It is very quick to update the index often in milliseconds.

    more over you have to start a service called Microsoft Full-Text Engine (MSFTE SQL).

    And the problem with this is? That's right there is no problem!

  • SQL Server 2000 stores the full-Text index outside the database.  SQL Server 2005 stores the full-text index inside the database.  This means that when a SQL Server 2005 database is restored, the full-text index is also restored, with no rebuilding required.

    The speed of building the full-text index was also improved.  A full-text index on SQL Server 2000 was taking up to 2 hours to build on a production server.  I restored the database to a laptop running SQL Server 2005 (upgrading the database in the process), and the full-text index built in around 6 minutes.

    The use of the full-text index resulted in faster calls to the database.  Without the full-text index, the response times were 10 - 100 times greater (your results may vary).

    I agree with the author's suggestion that there is extra overhead and learning involved with a full-text index, but the results can make it worthwhile.

  • The additional learning / overhead is definitely worth while. What if you wanted to use, Relevancy, Thesaurus, Inflections, Weightings, Boolean Logic, Wildcards, etc. These are all possible with FTS. Having spoken to Microsoft SQL / FTS Program Managers I understand they are considering building (or have already) FTS into the actual SQL Server 2008 database engine itself rather than having it as a separate service. This should increase index speeds further and will certainly increase the performance of any query which contains FTS logic. FTS is the only smart way to do this sort of thing.

  • SQL Server 2000 stores the full-Text index outside the database.  SQL Server 2005 stores the full-text index inside the database.  This means that when a SQL Server 2005 database is restored, the full-text index is also restored, with no rebuilding required.

    Actually the Full-Text Catalog is still stored in the file system.  http://msdn2.microsoft.com/en-us/library/ms142541.aspx

    There have been several improvements to SQL 2005 FTS, especially performance improvements to the Gatherer.  As mentioned, you can now back up your full-text catalogs with the database in 2005.

  • SQL Server 2005 documentation on the "CREATE FULL-TEXT CATALOG" statement shows the following:

    The default full-text filegroup is the primary filegroup for the database. We recommend that full-text catalogs be put on a secondary filegroup. This lets you do filegroup backup-and-restore operations of full-text catalogs independent of the data and log files. Additionally, by putting the full-text catalogs in a secondary filegroup, you can restore an online full-text catalog.

    Internal system tables are also used for storing full-text index information (from http://msdn2.microsoft.com/en-us/library/ms142541.aspx):

    Internal Tables

    Obviously, Microsoft will not tell us exactly how full-text catalog/indexes are created and used, but the above quotes show that SQL Server 2005 integrates full-text catalogs/indexes into the database better than SQL Server 2000.

    However, the exact integration of full-text catalogs/indexes is not important.  The authors code mimics full-text functionality, which is available when you install SQL Server.  Personally, I don't want to "re-invent the wheel" unless I have something to gain (performance, ease of maintenance, etc.).

  • There are actually a lot of very good articles available via MSDN and other sources that explain exactly how Full-Text Catalog information is gathered, processed, stored, and queried.  Unfortunately most articles give this info out in small bite-sized pieces, meaning you have to scour the Internet to piece together bits and pieces from all over.  SQL Server MVP Hilary Cotter is a great person to Google if you want specific information about how SQL FTS works.  He knows more about SQL FTS than probably anyone outside of Redmond, and he frequently writes and contributes to FTS articles and books.

    The internal tables you reference store some metadata about the full-text indexes/catalogs (basically ID number mappings/references) but they are not 'the' full-text indexes/catalogs:  http://msdn2.microsoft.com/en-us/library/ms366343.aspx

    I agree with your sentiment that reinventing the wheel without some additional gain is not worth the effort, except maybe in a purely academic setting where your grades depend on how well you reinvent the wheel.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply