July 22, 2009 at 2:18 pm
Hi.
In my script, the execution plan chose index seek in one case, but I want to test it with scan without changing the index itself (because I have a feeling that in this case scan might be better).
Is it possible with hint ? And how can I put it ?
Thanks
July 22, 2009 at 2:29 pm
Why? Seeks are faster than scans.
July 22, 2009 at 9:17 pm
Lynn Pettis (7/22/2009)
Why? Seeks are faster than scans.
Heh... because "A developer must not guess... a developer must KNOW." 😉 It'll be a nice test.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 22, 2009 at 9:18 pm
SQL Guy (7/22/2009)
Hi.In my script, the execution plan chose index seek in one case, but I want to test it with scan without changing the index itself (because I have a feeling that in this case scan might be better).
Is it possible with hint ? And how can I put it ?
Thanks
I believe that the only index you can force to scan is the clustered index. You'd do that with a [font="Arial Black"]WITH(INDEX(0)) [/font]hint.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 23, 2009 at 6:55 am
Jeff Moden (7/22/2009)
Lynn Pettis (7/22/2009)
Why? Seeks are faster than scans.Heh... because "A developer must not guess... a developer must KNOW." 😉 It'll be a nice test.
If it isn't true, there are several people on SSC that I've heard this from that I trust that may need to revisit this as well.
Hmm, how would one go about setting up a test like this...
July 23, 2009 at 7:19 am
The only time a scan can be faster than a seek is when there are multiple seeks against the same table (in clause, multiple ORs, inner table of a nested loop join). In other cases the scan can only ever be as fast as a seek, and that's in the case where the seek predicate matches all the rows in the table.
If you want to force a scan, put a function on the column that SQL's using for the seek. Add 0 if it's a number, add '' if it's a string. That's sufficient to disallow an index seek. Hint the index as well, because SQL may decide that it would rather scan (or seek) a different index from the one that you want to test.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 24, 2009 at 6:40 am
Lynn Pettis (7/23/2009)
Jeff Moden (7/22/2009)
Lynn Pettis (7/22/2009)
Why? Seeks are faster than scans.Heh... because "A developer must not guess... a developer must KNOW." 😉 It'll be a nice test.
If it isn't true, there are several people on SSC that I've heard this from that I trust that may need to revisit this as well.
Hmm, how would one go about setting up a test like this...
Everyone must know and not guess 🙂
Y'all should know better than to make such absolute statements! It depends!!
There are at least four cases where a scan may be as fast or faster than a seek (over the same, or very similar, number of records):
1. Where the table fits on a single page
2. The 'seek index' has a fill factor so low that it contains more pages than the heap/clustered index
3. The 'seek index' is very highly fragmented and the heap/clustered index is contiguous.
4. The query would benefit from parallelism, but the index has to be scanned backward.
Other points to consider:
Heaps are scanned in allocation order using the IAMs. This can be faster than scanning an index in logical page order if there is enough fragmentation. Clustered indexes can also be scanned using IAMs if a table lock is used or the read uncommitted isolation level is in effect.
Read-ahead will be more effective on physically contiguous data (especially on Enterprise/Developer/Trial Editions).
A scan over uniform extents can be faster than a seek over pages allocated from mixed extents.
Script below which demonstrates a scan outperforming a seek in terms of total execution time, physical reads, and logical reads.
Results on my system:
Seek:
physical reads: 283
logical reads: 11,197
worker time (cpu): 300ms
elapsed time: 3.5s
Scan:
physical reads: 133
logical reads: 7,989
worker time (cpu): 825ms
elapsed time: 1.8s
--
-- Run in a 'real' database, not tempdb, for best effect
-- ONLY run on a TEST system which you have EXCLUSIVE access to
--
ifOBJECT_ID(N'dbo.Invoice', N'U') IS NOT NULL
drop table dbo.Invoice
go
--
-- Test table
--
create table
dbo.Invoice
(
invoice_id int identity primary key nonclustered,
invoice_number char(10) not null,
issue_date datetime not null,
due_date datetime not null,
supplier_code char(4) not null
);
go
--
-- Index to seek with
--
create nonclustered index nc1 on dbo.Invoice (invoice_number) with (fillfactor = 100);
go
--
-- Add 1.5M rows of test data
--
insert
dbo.Invoice with (tablockx)
(
invoice_number,
issue_date,
due_date,
supplier_code
)
select
top (1500000)
invoice_number = LEFT(NEWID(), 10),
issue_date = DATEADD(DAY, RAND() * -14 - 14, CURRENT_TIMESTAMP),
due_date = DATEADD(DAY, RAND() * 14 + 14, CURRENT_TIMESTAMP),
supplier_code = RIGHT(NEWID(), 4)
from
master.sys.all_columns A1,
master.sys.all_columns A2,
master.sys.all_columns A3;
go
--
-- Fragment the non-clustered index, and split index pages
--
updatedbo.Invoice
setinvoice_number = REVERSE(invoice_number)
whereinvoice_id = any (select top (100) I2.invoice_id from dbo.Invoice I2)
go
--
-- Show the fragmentation
--
select index_type_desc, avg_fragmentation_in_percent, avg_page_space_used_in_percent, page_count, record_count
from sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.Invoice', N'U'), NULL, NULL, 'DETAILED')
where index_level = 0;
dbcc freesystemcache('SQL Plans')
go
checkpoint
dbcc dropcleanbuffers
go
set statistics io on
go
-- Seek
selectCOUNT_BIG(*)
fromdbo.Invoice
whereinvoice_number > ''
go
checkpoint
dbcc dropcleanbuffers
go
-- Scan
selectCOUNT_BIG(*)
fromdbo.Invoice with (index(0))
whereinvoice_number between SPACE(10) AND 'ZZZZZZZZZZ'
go
set statistics io off
select text, execution_count, total_worker_time, total_physical_reads, total_logical_reads, total_elapsed_time
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text (qs.sql_handle) t
where t.text like '%dbo.Invoice%'
and t.text not like '%sys.dm_exec_query_stats%'
order by last_execution_time asc;
July 24, 2009 at 6:43 am
GilaMonster (7/23/2009)
If you want to force a scan, put a function on the column that SQL's using for the seek. Add 0 if it's a number, add '' if it's a string. That's sufficient to disallow an index seek. Hint the index as well, because SQL may decide that it would rather scan (or seek) a different index from the one that you want to test.
Putting a function on the column will almost always prevent an index seek.
Simply adding a zero or empty string still results in a seek if the resulting expression can be constant folded*.
* edit: Oh, I see - you meant add the zero or empty string to the column name not the value to compare to! Yeah, that works 🙂
July 24, 2009 at 8:05 am
Paul White (7/24/2009)
There are at least four cases where a scan may be as fast or faster than a seek (over the same, or very similar, number of records):1. Where the table fits on a single page
2. The 'seek index' has a fill factor so low that it contains more pages than the heap/clustered index
3. The 'seek index' is very highly fragmented and the heap/clustered index is contiguous.
4. The query would benefit from parallelism, but the index has to be scanned backward.
Well, I was specifically talking about a seek on a particular index vs a scan on the same index, not a seek on a NC index vs a scan on a cluster. It's easily possible for a scan of the cluster/heap to be faster than a seek on a nonclustered index.
If, however, you're talking (as I interpreted the OP to be doing) about a seek on IndexX vs a scan on IndexX, then none of those cases apply (maybe the parallelism one, haven't checked)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 24, 2009 at 9:40 am
Thanks, Jeff, I used your hint WITH(INDEX(0)) and found that scan was slower in my particular query.
When I initially put my post, I did not mean to insist that scan will outpeform seek, in 95% of all cases you are all right, seek is far better. I just wanted to test it. And my intention was to find out how to hint it.
July 24, 2009 at 4:37 pm
Thanks for the feedback. I learned some stuff myself. Good thread! 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
July 24, 2009 at 5:00 pm
Paul White (7/24/2009)
Lynn Pettis (7/23/2009)
Jeff Moden (7/22/2009)
Lynn Pettis (7/22/2009)
Why? Seeks are faster than scans.Heh... because "A developer must not guess... a developer must KNOW." 😉 It'll be a nice test.
If it isn't true, there are several people on SSC that I've heard this from that I trust that may need to revisit this as well.
Hmm, how would one go about setting up a test like this...
Everyone must know and not guess 🙂
When I ran your good code, Paul, I got just the opposite results overall...
total_worker_time total_physical_reads total_logical_reads total_elapsed_time
392055 106 3942 1256268 --Seek
959229 1027 8001 3325338 --Scan
Perhaps a good reason for that is my system did a table scan instead of an index scan.:-P
--Jeff Moden
Change is inevitable... Change for the better is not.
July 25, 2009 at 12:37 am
Jeff Moden (7/24/2009)
When I ran your good code, Paul, I got just the opposite results overall...
total_worker_time total_physical_reads total_logical_reads total_elapsed_time
392055 106 3942 1256268 --Seek
959229 1027 8001 3325338 --Scan
Perhaps a good reason for that is my system did a table scan instead of an index scan.:-P
It is a table scan I was after. It was a table scan that out-performed the seek in my original test results.
It can be quite tricky to reproduce - though I found it easier in a 'real' database than in tempdb for some reason. You might need to fragment the nc index even more, maybe?
Thanks for taking the time to run it though. I realize my point was slightly off-topic, but I couldn't resist the opportunity to discuss some of the subtleties. A fun thread.
Paul
July 25, 2009 at 9:45 pm
Paul White (7/25/2009)
I realize my point was slightly off-topic, but I couldn't resist the opportunity to discuss some of the subtleties.
Speaking of "off topic"... that's part of the reason why I don't ever want them to allow the OP to close a thread... the discussions that sometimes follow are worth as much as the original problem. Good tangent, Gail and Paul.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 14 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply