Improving Performance for Some LIKE “%string%” Searches (SQL Spackle)

  • Thanks for the info Kevin. I was unaware that filtered indexes behaved like that. I've done some research and learnt something new today.

  • This is what I wanted to say with my post.

    Thanks, Pawel for saying it.

    The caveat with filtered indexes:

    If you use a filter of, say, col1 IS NULL,

    then you must state the same col1 IS NULL filter in your query, or the filtered index will not be used.

    Well, sometimes something in the query which implies the filter will be sufficient.

    So it is best to create a view which includes the filter condition and use that for querying.
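
    For illustration, a minimal sketch of that approach (the table, column, and index names here are invented):

    CREATE TABLE dbo.SomeTable (col1 INT NULL, col2 VARCHAR(50) NULL);
    GO
    -- Filtered index covering only the col1 IS NULL rows
    CREATE NONCLUSTERED INDEX IX_SomeTable_col2_filtered
        ON dbo.SomeTable (col2)
        WHERE col1 IS NULL;
    GO
    -- The view bakes the filter in, so every consumer restates it automatically
    CREATE VIEW dbo.SomeTable_Col1IsNull
    AS
    SELECT col2
    FROM dbo.SomeTable
    WHERE col1 IS NULL;
    GO
    -- Querying through the view repeats the filter, so the optimizer can match the filtered index
    SELECT col2
    FROM dbo.SomeTable_Col1IsNull
    WHERE col2 = 'some value';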

  • Interesting article, but I question the test harness.

    I think you should create a more realistic table that has a sizable number of columns. There should be a significant difference between a table scan and an index scan, which then translates into more real-life performance numbers.

    You should include full query plans for all matters related to performance.

    Thanks.

  • Raja M (4/16/2014)


    Interesting article, but I question the test harness.

    I think you should create a more realistic table that has a sizable number of columns. There should be a significant difference between a table scan and an index scan, which then translates into more real-life performance numbers.

    You should include full query plans for all matters related to performance.

    Thanks.

    Raja,

    If you read the blog linked in the article, you'll see that returning more columns from the table (when they're outside of the INDEX) will break the SEEK. That's why I provided that extended information, and there is a way to go back to a SEEK even in that case.
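
    For illustration only (a generic pattern, not necessarily the exact one from the blog): resolve the matching values against the narrow index first, then join back to the table for the wide columns. SomeOtherCol below is a made-up column name, and the sketch assumes AString can serve as the join key.

    WITH Matches AS
    (
        SELECT AString
        FROM dbo.StringsTest
        WHERE AString IS NOT NULL          -- SARGable, keeps the index seek
          AND AString LIKE '%221%'         -- applied as a residual predicate on the seek
    )
    SELECT st.AString,
           st.SomeOtherCol                 -- hypothetical wide column, fetched after the seek
    FROM Matches m
    INNER JOIN dbo.StringsTest st
        ON st.AString = m.AString;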

    Your comments about having a wider table are true for testing in your particular case, but that approach doesn't lend itself well to writing articles, where it is important to keep the examples as simple as possible. That is why I recommended testing carefully when you try this approach. I believe I even pointed out that I need to do more of that before I apply this to the Production case I'm working on (still not yet done).

    And as to query plans, I posted 3 in the article. Was there one missing that you wanted to see?

    Thanks for the comments.

    Dwain.C


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
    Since random numbers are too important to be left to chance, let's generate some!
    Learn to understand recursive CTEs by example.

  • Yes, the benefits of the new optimizer in SQL Server 2014?

  • Great article. Coincidentally I did a technical test for someone this week and one of the questions was to do with searching strings using the LIKE '%text%' pattern. It asked if it was best practice or not. I didn't really know how to answer the question in a binary way so went with the old staple "It depends....<mini essay>". They must have hated me 😀

    This article is a brilliant write-up of one of the (many) possible improvements to this process 🙂


    I'm on LinkedIn

  • Thanks for this post. It prompted me to revisit one of my queries that was taking over 30 seconds; it now takes 3!

  • PLEASE, could you, just in one sentence, tell the essence of what you are doing?

  • The statement "Because we know that SQL Server ignores NULL values when it constructs an INDEX" is incorrect.

    SQL Server DOES index NULLs.

    The CTE is not required. Simply adding a NOT NULL test to the WHERE clause will cause the query to do an index scan of just the non-null values.

    SELECT AString
    FROM dbo.StringsTest
    WHERE AString IS NOT NULL AND AString LIKE '%221%';

  • I utilised this in some in-house code, resolving the content of the CTE with further WHERE clause values (excluding the LIKE '%...%' clause for the moment).

    I then joined the CTE to other tables and applied the LIKE '%...%' in a WHERE clause after the join.

    The results were quite favourable: almost 1/3 the cost of the original query without the CTE.

    Example code:

    WITH Categories AS
    (
        SELECT category_id, K_ID
        FROM [CATEGORY_TABLE] WITH (NOLOCK)
        WHERE (<add WHERE clause here to limit the content of the CTE, as long as it isn't the LIKE '%...%' part>)
          AND category_id IS NOT NULL
    )
    SELECT DISTINCT cat.category_id
    FROM Categories cat
    INNER JOIN TERM_A trma WITH (NOLOCK)
        ON cat.K_ID = trma.K_ID
    WHERE (cat.category_id LIKE '%<the CTE LIKE clause value here>%')
      AND (trma.term LIKE '<your joined table search term here>%');

  • Thanks for the interesting article. As has been pointed out, the same improved query plan (using the index seek) can be achieved by adding the "IS NOT NULL" condition to the WHERE clause. Equivalently, I noticed that adding a condition such as "AString > '0'" will also result in the same query plan. It is essentially a meaningless condition in this context, except that it also eliminates the NULL values.

    Not that this achieves anything more, but I believe the crux of the matter is that including a sargable condition, such as "AString IS NOT NULL" or "AString > '0'", in the WHERE clause, joined to a non-sargable condition, such as "AString LIKE '%221%'", with the AND operator, will result in an index seek. If you look at the properties of the index seek operator in the plan, you'll see that, in addition to the seek predicate containing the sargable condition, there will also be a predicate containing the non-sargable condition.

    If you have no NULL rows, the following query will still have a plan with an index seek:

    SELECT AString
    FROM dbo.StringsTest
    WHERE AString IS NOT NULL AND AString LIKE '%221%';

    However, it will perform no better than an index scan, as I believe your test results showed. That is, calling it a seek doesn't mean it's any better, in that case. It's basically "seeking" every row in the table, and then checking a filter on every row, just like a scan.
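
    A quick sketch of the "> '0'" variant, for comparison:

    SELECT AString
    FROM dbo.StringsTest
    WHERE AString > '0'            -- sargable: appears as the seek predicate
      AND AString LIKE '%221%';    -- non-sargable: appears as the residual predicate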

  • What about converting a given column to a sparse column?
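
    For anyone who hasn't used it, marking an existing nullable column as SPARSE looks like the sketch below (hypothetical table and column names). Note that SPARSE only optimizes the storage of NULLs; it doesn't by itself change how a leading-wildcard LIKE is evaluated.

    -- Hypothetical table and column, shown only to illustrate the syntax
    ALTER TABLE dbo.SomeWideTable
        ALTER COLUMN Notes ADD SPARSE;   -- the column must already allow NULLs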

    Don Simpson



    I'm not sure about Heisenberg.

  • A word of caution regarding CTEs.

    I have run into problems using CTEs against very large tables, especially in the 2008 line of SQL Server, where the CTE can cause something approaching vapor lock. It gets worse with nested CTEs, because the queries inside the CTEs are re-executed every time they are referenced. The symptom, when watched with sp_who2, is that the query fails to progress for many minutes, even hours. It consumes CPU and blows out to a large number of parallelized executions.

    Refactoring the same query to replace the CTE with a derived table, or dumping the CTE results to a #temp table first, restores the query to sub-minute performance.

    It seems that there may have been a defect in 2008 R2 that was resolved in 2012. In other words, we could readily reproduce the problem in 2008 R2 but not in 2012.

    We've removed all use of CTEs from our code, except for recursive queries. Even there, we dump the results to a #temp table and use the #temp thereafter.
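
    A minimal sketch of that refactoring, borrowing the table and column names from the earlier post in this thread (the '%221%' search value is just an example):

    -- Materialize the CTE's result once instead of re-executing it per reference
    SELECT category_id, K_ID
    INTO #Categories
    FROM [CATEGORY_TABLE]
    WHERE category_id IS NOT NULL;

    CREATE CLUSTERED INDEX IX_Categories_KID ON #Categories (K_ID);

    SELECT DISTINCT cat.category_id
    FROM #Categories cat
    INNER JOIN TERM_A trma
        ON cat.K_ID = trma.K_ID
    WHERE cat.category_id LIKE '%221%';

    DROP TABLE #Categories;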

  • Hi Dwain,

    Great article, as usual. It spawned some pretty good discussion, as well. Well done and thanks for taking the time to write something like this.

    I did find, as someone else mentioned on this thread, that adding a sargable lookup does the same thing as the CTE does. If you take that one step further, you can improve the query and also ignore blanks as well as NULLs using the following. Of course and as always, YMMV and "It Depends". 😀

    SELECT AString
    FROM dbo.StringsTest
    WHERE AString > ''
    AND AString LIKE '%221%';

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • h.tobisch (1/16/2015)


    PLEASE, could you, just in one sentence, tell the essence of what you are doing?

    Short-circuit the query using a SARGable predicate to take advantage of the index, which a leading-wildcard lookup alone will not do.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

