Tips for Full-Text Indexing/Catalog Population/Querying in SQL 7.0 and 2000
This article is a brief summary of several tips & tricks I have learned through working with the Full-Text features in SQL Server.
Invalid Catalog Path:
After installing SQL Server 2000 on a new machine and re-attaching the databases from tape back-up, I experienced some difficulties in getting the Full-Text Indexing to work properly.
I received an error referencing an invalid location for the Full-Text catalog from within Enterprise Manager when I tried to run a full population of the catalog. The stored procedures for disabling and removing Full-Text indexes and catalogs did not resolve the issue. In poking around, I found the problem stemmed from the fact that on the previous machine, the catalog was located on the F: drive. On the new machine, there was no F: drive.
The Full-Text Indexing wizard in SQL 2000 does not allow the user to alter the location of an existing catalog. It only lets the user create a new catalog. I tried to create a new catalog in a new location as a work around, but because SQL could not resolve the erroneous location of the previous catalog, I could not complete the wizard.
To fix the problem, I changed the SQL Server behavior to allow modifications to the system catalogs. I looked in the sysFullTextCatalogs table in the current database and changed the 'path' field value to the new location. (If the value is NULL, it means a path was not given at setup and the default installation path was used.) This allowed me to modify the Full-Text Indexing on the new machine. (Remember to change the server behavior back to its original setting.)
Incremental Population Discrepencies:
Incremental Full-Text Index population in SQL 7.0 and 2000 exhibit different behaviors. The behavior of SQL 7.0's Full-Text catalog population is documented, but sometimes hard to find, so I wanted to discuss it here.
On my SQL 7.0 machine, I scheduled an incremental population. When I checked in to see if it had run, it had not. In SQL 2000, I can schedule an incremental population and not have to worry about first running a full population. SQL 2000 knows if it is the first population or if it is a subsequent population. SQL 7.0 does not have this feature. If you schedule an incremental population without running a full population first, the incremental population will not run. And when you run your full-text query, it will return nothing. (Helpful hint: Make sure you have a field of type TimeStamp in your table. Without one, you cannot properly run an incremental population.)
Full-Text Querying Discrepencies:
There are some differences between SQL 7.0 and 2000 in their Full-Text Querying capabilites. SQL 7.0 has limitations in the number of 'CONTAINS' clauses it can process at any one time (in my testing, it is around 16). I have not been able to find any specific documentation on this issue, but SQL 2000 does not seem to have this limit. Below is a brief reference from Microsoft on this issue: If you are using multiple CONTAINS or FREETEXT predicates in your SQL query and are experiencing poor full-text search query performance, reduce the number of CONTAINS or FREETEXT predicates or using "*" to use all full-text indexed columns in your query.
In SQL 7.0, if there are excessive grouping parenthesis (even though they match up), the query will hang. Even when the command timeout property is set, the query will hang past the command timeout value assigned, and you will receive an error message of 'Connection Failure -2147467259'. When the extra parenthesis are removed, the query executes fine. In SQL 2000 the original query runs with no problems.
When a Full-Text query in SQL 7.0 contained a single noise word, I would receive the error 'Query contained only ignored words'. SQL 2000, handled the noise words and returned the query results. In SQL 7.0, I had to remove all noise words from the query for it to run successfully. Here is a recommendation from Microsoft pertaining to this issue: You also may encounter Error 7619, "The query contained only ignored words" when using any of the full-text predicates in a full-text query, such as CONTAINS(pr_info, 'between AND king'). The word "between" is an ignored or noise word and the full-text query parser considers this an error, even with an OR clause. Consider rewriting this query to a phrase-based query, removing the noise word, or options offered in Knowledge Base article Q246800, "INF: Correctly Parsing Quotation Marks in FTS Queries". Also, consider using Windows 2000 Server: there have been some enhancements to the word-breaker files for Indexing Services.
(noise.enu - located in ...\FTDATA\SQLServer\Config\) about 1 after 2 all also 3 an 4 and 5 another 6 any 7 are 8 as 9 at 0 be $ because been before being between both but by came can come could did do does each else for from get got has had he have her here him himself his how if in into is it its just like make many me might more most much must my never now of on only or other our out over re said same see should since so some still such take than that the their them then there these they this those through to too under up use very want was way we well were what when where which while who will with would you your a b c d e f g h i j k l m n o p q r s t u v w x y z