Optimize a LIKE '%text%' query

Question

Optimize a LIKE '%text%' query

ralu_k_17

SSCrazy

Points: 2190
More actions
November 9, 2018 at 6:52 am

#382273

Hello!
Is there a way of optimizing(/workaround) the performance of a LIKE '%text%' query, on a non-unique column (so, no full text index possible)? Both wildcards are required.
Thank you for any input!

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Y.B. SSChampion Points: 11642 More actions · Answer 1

Well the leading wildcard in that LIKE clause will guarantee it will have to do full scans. Something LIKE 'text%' is fine though. I guess it comes down to your requirements.

SELECT quote FROM brain WHERE original = 1
0 rows returned

ralu_k_17 SSCrazy Points: 2190 More actions · Answer 2

Unfortunately, both wildcards are required. I will update my initial post with this info, as well.

Jonathan AC Roberts SSCoach Points: 18257 More actions · Answer 3

Alternatives are:
select * from myTable x where x.Description like '%text%' select * from myTable x where PATINDEX('%text%',x.Description)>0 select * from myTable x where CHARINDEX('text',x.Description)>0

I doubt there is much difference between them though you would think that CHARINDEX has less logic in so it should be faster, but I don't think that's the case.

ralu_k_17 SSCrazy Points: 2190 More actions · Answer 4

Thank you for your reply!
Interestingly enough, if I run the second option (PATINDEX) I see a 3sec improvement, which is huge, because the initial run was 4-5sec and this is about 1-2sec.
The problem is, if I put just this query (PATINDEX) , with absolutely no other statements, into a sp and send the %text% as a parameter to the sp, it goes back to 5sec

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 5

Capture and compare the actual execution plans between the various options.

Jason A. Long SSC-Insane Points: 23780 More actions · Answer 6

ralu_k_17 - Friday, November 9, 2018 6:52 AM
Hello!
Is there a way of optimizing(/workaround) the performance of a LIKE '%text%' query, on a non-unique column (so, no full text index possible)? Both wildcards are required.
Thank you for any input!

It depends... There is a method that works but it requires that you maintain an "N-grams" table of your values.

Check out the following two articles...
http://www.sqlservercentral.com/articles/Tally+Table/142316/
https://sqlperformance.com/2017/02/sql-indexes/seek-leading-wildcard-sql-server

ScottPletcher SSC Guru Points: 100946 More actions · Answer 7

Create a nonclustered index using the clustering column(s) as key(s) and INCLUDE the search column. That's about the best you can do with this type of requirement.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Eric M Russell SSC Guru Points: 125520 More actions · Answer 8

I don't know your database, but it could be argued that if users must routinely perform LIKE '%keyword%' queries on text columns just to locate the rows they are looking for, then the data model is not appropriately normalized. For example, if users are querying WHERE CustomerName LIKE '%Smith%', then there should definitely instead have FirstName and LastName columns. Or, if they are querying WHERE ProductName LIKE '%laptop%' then that suggests a ProductCategoryID column and ProductCategory table should be introduced. So, the best solution depends on the context of why they are needing to perform this text searching.

As someone suggested earlier, one approach is to parse the contents of the text name/description column into a keyword lookup table similar to the example below. Another benefit to implementing a search keyword lookup table is that the Keywords table can be extended to contain other attributes like a count of how times the keyword was searched and additional columns for categorization and semantic type searching. For example, all of the laptop, TV, phone, etc. related keywords could be assigned to a category called 'electronics'.

CREATE TABLE Products ( PRIMARY KEY CLUSTERED ( ProductID ) , ProductID INT NOT NULL , ProductName VARCHAR(500) NOT NULL );

CREATE TABLE ProductKeywords ( PRIMARY KEY CLUSTERED ( KeywordID, ProductID ) , KeywordID INT NOT NULL , ProductID INT NOT NULL );

CREATE TABLE Keywords ( PRIMARY KEY CLUSTERED ( Keyword ) , KeywordName VARCHAR(200) NOT NULL , KeywordID INT NOT NULL , CategoryID INT NOT NULL );

-- Query Products by keyword: SELECT P.ProductID, P.ProductName FROM Keywords AS K JOIN ProductKeywords AS PK ON PK.KeywordID = K.KeywordID JOIN Products AS P ON P.ProductID = PK.ProductID WHERE K.KeywordName = 'laptop';

-- Query Products by category: SELECT P.ProductID, P.ProductName FROM Keywords AS K JOIN ProductKeywords AS PK ON PK.KeywordID = K.KeywordID JOIN Products AS P ON P.ProductID = PK.ProductID WHERE K.CategoryID = 12;

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Jeff Moden SSC Guru Points: 1003857 More actions · Answer 9

ralu_k_17 - Friday, November 9, 2018 6:52 AM
Hello!
Is there a way of optimizing(/workaround) the performance of a LIKE '%text%' query, on a non-unique column (so, no full text index possible)? Both wildcards are required.
Thank you for any input!

1. Does the table have a PK?
2. How wide is the text data and can you provide an example so we can see what you're actually having to work with?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Sergiy SSC Guru Points: 110206 More actions · Answer 10

ralu_k_17 - Friday, November 9, 2018 6:52 AM
Hello!
Is there a way of optimizing(/workaround) the performance of a LIKE '%text%' query, on a non-unique column (so, no full text index possible)? Both wildcards are required.
Thank you for any input!

You may start from normalizing the data.

_____________
Code for TallyGenerator