Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Optimising “Ends With” searches with REVERSE


Optimising “Ends With” searches with REVERSE

Author
Message
ben-564110
ben-564110
SSC-Addicted
SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)

Group: General Forum Members
Points: 492 Visits: 379
Comments posted to this topic are about the item Optimising “Ends With” searches with REVERSE
Varun.c
Varun.c
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 40
Nice Article.

I think you need to correct this query
from
SELECT NAME FROM TEST_TABLE WHERE NAME LIKE 'DI%'
To
SELECT NAME FROM TEST_TABLE WHERE NAME_REVERSED LIKE 'DI%'


Best,
Varun.C
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)SSC Guru (51K reputation)

Group: General Forum Members
Points: 51833 Visits: 40308
Good article and makes a lot of sense... providing that the table doesn't have a lot of inserts where every index counts AGAINST inserts. Most people don't know that INSERTS on heavily indexed tables are one of the primary causes of very high reads. For example and using the given example table, if I run the following code...

--drop table test_table
SELECT object_id, name, system_type_id INTO test_table FROM master.sys.all_parameters

CREATE NONCLUSTERED INDEX ix_name ON test_table (name ASC)

ALTER TABLE test_table ADD name_REVERSED AS REVERSE(name)


PRINT '========== Insert with no index =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
INSERT INTO test_table SELECT 1,'Dodah',2
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO
CREATE NONCLUSTERED INDEX IX_REVERSED ON TEST_TABLE (NAME_REVERSED ASC) INCLUDE (NAME)
GO
PRINT '========== Insert with index =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
INSERT INTO test_table SELECT 1,'Dodah',2
SET STATISTICS TIME OFF
SET STATISTICS IO OFF



... then we can see that a single insert after the index is created now takes 11 reads instead of just 3.


(6756 row(s) affected)
========== Insert with no index ==========
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
Table 'test_table'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.

(1 row(s) affected)
========== Insert with index ==========
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
Table 'test_table'. Scan count 0, logical reads 11, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.

(1 row(s) affected)

If you have an IO bound system, you need to be really careful about adding any indexes to tables that have a high insertion rate. Like everything else, "It Depends" and only a bit of "complete" testing will show you things that you may have not considered.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Paul White
Paul White
SSChampion
SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)

Group: General Forum Members
Points: 11132 Visits: 11353
Jeff Moden (1/13/2010)
Good article and makes a lot of sense... providing that the table doesn't have a lot of inserts where every index counts AGAINST inserts. Most people don't know that INSERTS on heavily indexed tables are one of the primary causes of very high reads. For example and using the given example table, if I run the following code...

... then we can see that a single insert after the index is created now takes 11 reads instead of just 3.

On my 2008 system, the logical reads change from 3 to 5 (there are 7090 rows).

Jeff Moden (1/13/2010)
If you have an IO bound system, you need to be really careful about adding any indexes to tables that have a high insertion rate. Like everything else, "It Depends" and only a bit of "complete" testing will show you things that you may have not considered.

It is important to consider the impact of index maintenance, for sure, but there's nothing unusual about that. Every new index has to justify itself in terms of the trade off between increased maintenance overhead versus the benefits produced by having the index. Generally, one would expect most useful indexes to involve a net saving in reads; the reduced reads coming from queries that benefit from the index should outweigh the reads added by index maintenance.

Don't get me started on what a poor metric the number of 'logical reads is' ;-)

All that aside, I found this to be a good article, and a fine approach for any system that finds itself requiring a lot of 'ends with' searches. It is also well presented and clear - so top marks for style.

Paul



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
ben-564110
ben-564110
SSC-Addicted
SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)

Group: General Forum Members
Points: 492 Visits: 379
Hi Varun.C,

Well spotted!

I have made the correction.

Many thanks,
Ben
Rome1981
Rome1981
Valued Member
Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)

Group: General Forum Members
Points: 63 Visits: 106
I was just about to post the same. Good to know there are people testing these solutions/examples.

Jeff- That is very good advice, I did not know that.
Tab Alleman
Tab Alleman
Old Hand
Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)Old Hand (321 reputation)

Group: General Forum Members
Points: 321 Visits: 115
When I saw the title of this article, I immediately envisioned replacing:

...WHERE Name LIKE '%son'

with:

WHERE LEFT(REVERSE(Name),3)='nos'

Because to answer the introductory question, "How often do I need to do a search like...?": Not nearly often enough to justify creating a new indexed column solely for that purpose. A pure-SQL solution would be interesting though.



peter-757102
peter-757102
Mr or Mrs. 500
Mr or Mrs. 500 (529 reputation)Mr or Mrs. 500 (529 reputation)Mr or Mrs. 500 (529 reputation)Mr or Mrs. 500 (529 reputation)Mr or Mrs. 500 (529 reputation)Mr or Mrs. 500 (529 reputation)Mr or Mrs. 500 (529 reputation)Mr or Mrs. 500 (529 reputation)

Group: General Forum Members
Points: 529 Visits: 2552
Note that indexes can only have a limited width (300 bytes or so , I believe).

Because of this, it makes sense to do a reverse of only the last N characters and limit the length of the search text to match. If your strings have varying lengths between 3 and 250 characters and your search strings are strict enough with 12 characters, add a few extra and use the reverse of the rightmost 16 characters to put the index on and truncate the search string to 16 characters as well.

This saves you a ton of otherwise unneeded duplicate data and keeps the number of page reads to a minimum.
alen teplitsky
alen teplitsky
SSCommitted
SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)

Group: General Forum Members
Points: 1803 Visits: 4626
this is great and would be a great replacement for Full Text Indexing for smaller text columns. my biggest gripe about FTI is the blocking caused by inserting large amounts of data if you have change tracking set to auto
ben-564110
ben-564110
SSC-Addicted
SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)SSC-Addicted (492 reputation)

Group: General Forum Members
Points: 492 Visits: 379
peter-757102 (1/13/2010)
Note that indexes can only have a limited width (300 bytes or so , I believe).

Because of this, it makes sense to do a reverse of only the last N characters and limit the length of the search text to match. If your strings have varying lengths between 3 and 250 characters and your search strings are strict enough with 12 characters, add a few extra and use the reverse of the rightmost 16 characters to put the index on and truncate the search string to 16 characters as well.

This saves you a ton of otherwise unneeded duplicate data and keeps the number of page reads to a minimum.


That's a great idea and a good refinement. I suppose you could also search on the last n characters which would potentially return you a list of primary keys from the table and then do a normal 'forwards' search on the resulting rows to narrow those down to the specifc results you want. The max index key size is 900 bytes BTW. See [url=http://msdn.microsoft.com/en-us/library/ms191241.aspx][/url] for details.

Regards,
Ben
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search