Quick question on Full-Text Indexes

  • Let's say I have the following table configuration:

    CREATE TABLE TableA

    (

    ID INT IDENTITY PRIMARY KEY,

    DateDeleted SMALLDATETIME

    )

    CREATE TABLE TableB

    (

    ID INT IDENTITY PRIMARY KEY,

    TableA_ID INT,

    [Text] VARCHAR(500),

    DateDeleted SMALLDATETIME

    )

    Which of the following two queries would be optimal and scalable?

    SELECT TableA.ID

    FROM TableA

    JOIN TableB ON TableA_ID = TableA.ID AND TableB.DateDeleted IS NULL AND CONTAINS(*, @Input)

    WHERE TableA.DateDeleted IS NULL

    WITH cte AS (

    SELECT TableA_ID

    FROM TableB

    WHERE TableB.DateDeleted IS NULL AND CONTAINS(*, @Input)

    )

    SELECT ID

    FROM TableA

    JOIN cte ON TableA_ID = ID

    WHERE DateDeleted IS NULL

    Basically the difference between the two is that, in the former case, I'm filtering out the deleted records from TableA on the join with TableB at the same time as I'm using the full-text search, while in the latter case, I'm filtering TableB using the full-text search first, and then removing the deleted records.

    Assume that the number of records which are not deleted in TableA and TableB will remain constant at roughly 15000 records, while the number of deleted records will grow at a rate of around 60000 records per year. Also assume that there are appropriate covering indexes on both tables, as well as the full-text index on the field [Text] in TableB.

  • It's likely that both versions will produce the same execution plan. I would probably go without the CTE because the first query is less complicated. That said, it may be possible that filtering out the NULLs within a CTE could improve performance, note this article[/url]. This is something that would be worth testing. I'd say try them both out and post the actual execution plans here.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • SELECT ID

    FROM TableA

    WHERE DateDeleted IS NULL

    AND EXISTS (SELECT * FROM TableB

    WHERE TableB.DateDeleted IS NULL AND CONTAINS(*, @Input)

    AND TableB.TableA_ID = TableA.ID

    )

    _____________
    Code for TallyGenerator

  • kramaswamy (8/17/2016)


    ...Basically the difference between the two is that, in the former case, I'm filtering out the deleted records from TableA on the join with TableB at the same time as I'm using the full-text search, while in the latter case, I'm filtering TableB using the full-text search first, and then removing the deleted records. ...

    Not really. Your two queries are the same and will yield exactly the same execution plan. Furthermore, SQL Server will perform those operations in whatever order results in the lowest-costed plan.

    Ramp up your rowcounts in a test environment and test your queries and also Sergiy's. You're likely to find that Sergiy's query performs better than yours - it's a different query.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply