• quickdraw (2/28/2011)


    #1 This will be faster than full text search for large loads (hundreds of queries per minute) on a db with millions of records/documents to search.

    ...

    Full text search was too slow for a high load environment, and I'd bet money on the fact that SQL Server is the same. You just can't push hundreds of full text searches per minute with a high number of documents. The method here will be faster.

    ------------------------------------------------

    #2 This method doesn't allow wildcards

    We dynamically built a SQL query for each keyword. This allows for the wildcards, and for special processing to allow wildcard suffixes. (See #3)

    ------------------------------------------------

    #3 By reversing the kewords in the keywords table, you get a new feature no database has (without a full table scan)

    One additional thing I did is to reverse all the keywords in the keywords table. Since Wildcard LIKE searches were allowed on the keywords table (like in full text search), having each keyword spelled backwards allowed a wonderful feature:

    Search keywords: "hold her %ly"

    [search for cheesy love novels: "hold her gently", "hold her gingerly", "hold her tightly"]

    re: #1: We've been using FTS in SS2005 and SS2008 on moderate sized tables (about 4M items) with high concurrency and it performs very well. The key (for us) was to isolate the search results from the rest of query. For example:

    SELECT ... FROM A

    JOIN (SELECT key FROM A WHERE CONTAINS( search )) as S on S.key = A.key

    re: #2: The search predicate syntax is a bit involved so we normalize the user supplied text before using it as a search expression. For example, 'IRON MAN' becomes '("IRON*" and "MAN*") or "IRONMAN*"' and that meets most of business needs. We also strip out any punctuation and support special prefixes that the user app can send to modify the search behavior (e.g.: '>IRON MAN' becomes "IRON MAN*" which looks for an item with a word starting with IRON and any following word starting with MAN). We also let "power users" enter raw search expression.

    re: #3: Our full text indexes are based on indexed views which merge several columns together and also clean up the indexed text.

    CREATE VIEW IndexedConsultant WITH SCHEMA_BINDING

    as SELECT *, replace(firstName+ ' '+ lastName+ ' '+ roleTitle+ ' '+ contractReviews + placementNotes,'+',' ')) as combinedText

    FROM Consultant ...

    This improved search behavior and performance for our app since we only needed to search one column and could locate items where the matches are not necessarily from a single column. In the following example any consultant's named Jones who are Engineers will be found along with any consultants who might have been placed at Jones Engineering.

    SELECT as consultantId FROM CONTAINSTABLE(IndexedConsultant,dbo.fNomalizeSearch('Jones Engineer'))

    You could add your reversing logic to the column definition of the indexed text.

    All of our full text searches are handled by UDFs with no dynamic SQL required. The results can then be joined back to their base tables and the logic easily called by user app, included in stored procedures, or included in more complex searches. So, I suggest you should take another look at the built-in FTS capabilities because it would be difficult to match it's flexibility, speed, and code-clarity.