Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Optimising “Ends With” searches with REVERSE


Optimising “Ends With” searches with REVERSE

Author
Message
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44956 Visits: 39862
Jeff Moden (1/13/2010)
Don't get me started on what a poor metric the number of 'logical reads is'


Heh... now you know why I couched it all in "It Depends". Hehe Between that and the crud that shows up in the execution plan....

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44956 Visits: 39862
peter-757102 (1/13/2010)
Note that indexes can only have a limited width (300 bytes or so , I believe).

Because of this, it makes sense to do a reverse of only the last N characters and limit the length of the search text to match. If your strings have varying lengths between 3 and 250 characters and your search strings are strict enough with 12 characters, add a few extra and use the reverse of the rightmost 16 characters to put the index on and truncate the search string to 16 characters as well.

This saves you a ton of otherwise unneeded duplicate data and keeps the number of page reads to a minimum.



It's actually 900 bytes for an index but very good point especially if you need to do it often.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44956 Visits: 39862
Rome1981 (1/13/2010)
I was just about to post the same. Good to know there are people testing these solutions/examples.

Jeff- That is very good advice, I did not know that.


Thanks for the feedback. They key is that you have to look, though. Like Paul says, "Logical Reads" isn't the best and most reliable parametric and execution plans can lie like a rug.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Peter Brase
Peter Brase
Valued Member
Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)

Group: General Forum Members
Points: 63 Visits: 66
Thanks for this, an excellent tip! I discovered that even if you skip the extra column and just run the reverse query, i.e., "where reverse(fieldname) like 'cba%'" instead of "where fieldname like '%abc'" the response time improves by 50%.
JJ B
JJ B
Old Hand
Old Hand (327 reputation)Old Hand (327 reputation)Old Hand (327 reputation)Old Hand (327 reputation)Old Hand (327 reputation)Old Hand (327 reputation)Old Hand (327 reputation)Old Hand (327 reputation)

Group: General Forum Members
Points: 327 Visits: 2846
What great timing for this article!

I'm just finishing up the design for an application that involves a "call module". This module is a place for users to record various data about phone calls that come into our agency. One of the features in the application is that the users need to be able to look up (search) records by the caller's phone number.

For various reasons, when entering the phone number data, users will have to enter the full phone number, including an area code. Ex: (541)444-5555.

However, the VAST majority of phone numbers that our users will enter or see will have the same area code. So, I've been playing with the idea of adding a feature where staff only have to enter the last 7 digits when doing a search.

I've been reluctant to implement such a feature for a couple reasons, one of which is precisely because I didn't want to have a WHERE clause such as:
WHERE phone like '%444-5555'
Yuck.

This article has given me a technique that makes that feature more palatable. I understand Jeff's caution. Even so, I'm excited about this idea because I think it might be perfect for the situation in my application. Thanks.

(The reason I explained my situation in detail is because I thought it might be helpful to provide a real-world example of when this need might come into play. If I didn't have such a need myself right now, I would not have been able to think of a realistic example of why anyone would really want to provide a "search the end" feature for users.)
john.arnott
john.arnott
UDP Broadcaster
UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)

Group: General Forum Members
Points: 1466 Visits: 3059
Bravo! Excellent example of explaining a technique that, once understood, seems so very natural. And, once again the discussion adds tremendously by making it clear that this is a "pay me now or pay me later" situation (absorb the overhead of index maintenance in favor of faster queries) and by suggesting improvements like Peter's (make the computed column only the last N characters of the search column).

Articles and discussions like this have made SSC my favorite web site (edging out ESPN.com).
=============
edit: fix minor typo
johnzabroski
johnzabroski
Valued Member
Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)

Group: General Forum Members
Points: 55 Visits: 191
You can circumvent the max index key size

http://blogs.msdn.com/anthonybloesch/archive/2009/11/23/implementing-uniqueness-constraints-on-large-columns.aspx
SQLRNNR
SQLRNNR
SSC-Insane
SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)

Group: General Forum Members
Points: 21055 Visits: 18258
Jeff, Paul and others have made some excellent comments. I think this was a nice article. It certainly opens up avenues of thinking for any who might need to refine similar situations.



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw

ben-564110
ben-564110
SSC-Addicted
SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)

Group: General Forum Members
Points: 492 Visits: 379
Peter Brase (1/13/2010)
Thanks for this, an excellent tip! I discovered that even if you skip the extra column and just run the reverse query, i.e., "where reverse(fieldname) like 'cba%'" instead of "where fieldname like '%abc'" the response time improves by 50%.

Hi Peter,

Depending on how you measured that response time I'd be a little bit wary of that test. It might be simply that the first time you ran the query, the pages returned were kept in the SQL cache, and then when the second query was run (which would return exactly the same rows/pages) the data came straight from the cache and was therefore quicker.

Regards,
Ben
Calvin Lawson
Calvin Lawson
Mr or Mrs. 500
Mr or Mrs. 500 (582 reputation)Mr or Mrs. 500 (582 reputation)Mr or Mrs. 500 (582 reputation)Mr or Mrs. 500 (582 reputation)Mr or Mrs. 500 (582 reputation)Mr or Mrs. 500 (582 reputation)Mr or Mrs. 500 (582 reputation)Mr or Mrs. 500 (582 reputation)

Group: General Forum Members
Points: 582 Visits: 102
Thanks Ben, great idea for when you absolutely have to have those fast lookups with wildcards at the beginning of the search term. Which happens more often then one might think...

One small point; for the sake of code clarity, I would do something like this:

SELECT NAME FROM TEST_TABLE WHERE NAME_REVERSED LIKE REVERSE('%ID')

Signature is NULL
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search