Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Index usage and RID lookup.


Index usage and RID lookup.

Author
Message
ben.brugman
ben.brugman
SSC-Addicted
SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)

Group: General Forum Members
Points: 463 Visits: 2217
[b]GilaMonster (1/7/2013)

All processing is done in memory, regardless of whether it's an index seek, scan, lookup, etc. The query processor has no idea what a file is or how to do disk access.


Something has to do the disk accesses. And because the physical reads on disk is often the determining factor for throughput, this must be somehow included in the optimizer. (I am aware that the optimizer does not know what is present in the cache, but it does base it's optimizing on how many pages are needed).

I am not completely aware what the
query processor
query engine
RDBMS engine
SQL-server engine
Is.

For me the database is the part which is still there if you pull the plug out of the machine.
But do not realy understand the distinction between the above mentioned processors/engines.

For me the RDBMS engine does deliver the results and does take all the neccesary actions to get those results. (Both for select and mutation statements).

The query processor ??? processes the query (compilation and execution) I assume.

The query engine ??? invokes the query processor and gets the neccesary data ???

SQL-server engine is a RDBMS engine.

But for me the borders between the different parts are a bit Crazy vague. Sorry.

Ben
GilaMonster
GilaMonster
SSC-Forever
SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)

Group: General Forum Members
Points: 47091 Visits: 44344
ben.brugman (1/7/2013)
Something has to do the disk accesses.


The storage engine

And because the physical reads on disk is often the determining factor for throughput, this must be somehow included in the optimizer.


The optimiser assumes that none of the required data will be in cache.

You have (in general)
The optimiser - responsible for generating the execution plans
The query processor - executes the plans (also called the query execution engine)
The storage engine - responsible for all disk access, for giving rows to the query processor, for handling transactions and a whole bunch more.

The query processor (what actually executes queries) has no clue what a disk or file is. It asks the storage engine for the rows it needs. If the rows are on pages in cache, the storage engine just gives it rows. If they're on pages not in cache, the storage engine fetches them into cache, then reads the rows off and hands them to the query processor


Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


ScottPletcher
ScottPletcher
Hall of Fame
Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)

Group: General Forum Members
Points: 3937 Visits: 6661
With a clustered index on ( A, B, C ) and a


WHERE A = 6 AND C = 7




I would expect SQL to a clustered index seek.

That is, it will do an indexed search to go straight to the "A = 6" rows and read just those; no other (non-index) rows would have to be read. SQL will search those rows looking for "C = 7".


With only nonclustered indexes, the process is more involved.

SQL first has to determine if you have a "covering index": that is, an index with ALL columns used in the query. If so, SQL will use that; if multiple indexes with all columns, SQL uses the one with the shortest row.

If there is no fully covering index, SQL will decide if an index with all the WHERE column(s) is available, and if it is worth using. Since reads using a nonclustered index involve lots of additional I/O -- SQL must use the rows in the nonclus index to go back and do random reads on the parent row in the heap/clus index -- there is a "tipping point" where SQL won't use a nonclustered index at all and instead just scan the table itself.

There is no specific % of rows where SQL will or won't use a nonclus index. The exact % varies. For example, if only 1% of rows have "A = 6", SQL is almost certain to use the nonclus index. If 90% of the rows have "A = 6", SQL is almost certain to do a table scan. Typically the "tipping point" is somewhere between 10% and 30% of rows, but even that is just a rough guideline and could vary for specific situations.

SQL DBA,SQL Server MVP('07, '08, '09)

Prosecutor James Blackburn, in closing argument in the "Fatal Vision" murders trial: "If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them."
GilaMonster
GilaMonster
SSC-Forever
SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)

Group: General Forum Members
Points: 47091 Visits: 44344
ScottPletcher (1/7/2013)
If so, SQL will use that; if multiple indexes with all columns, SQL uses the one with the shortest row.


The most selective one (the one that, based on the statistics, is going to return the fewest rows after the seek). If multiple indexes are equally selective, then the one with the fewest number of pages.

For example, if only 1% of rows have "A = 6", SQL is almost certain to use the nonclus index.


If 1% of the rows have A=6 AND C=7 (the index has both columns, therefore both conditions limit the number of rows needing a key lookup), SQL is almost certain to do a table scan.

Typically the "tipping point" is somewhere between 10% and 30% of rows, but even that is just a rough guideline and could vary for specific situations.


The tipping point is where the number of rows needing a lookup = 30% of the number of pages in the table. Usually this is somewhere around 0.1% - 0.5% of the table, but depends on the size of the rows.


Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


ScottPletcher
ScottPletcher
Hall of Fame
Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)

Group: General Forum Members
Points: 3937 Visits: 6661
GilaMonster (1/7/2013)
ScottPletcher (1/7/2013)
If so, SQL will use that; if multiple indexes with all columns, SQL uses the one with the shortest row.


The most selective one (the one that, based on the statistics, is going to return the fewest rows after the seek). If multiple indexes are equally selective, then the one with the fewest number of pages.

For example, if only 1% of rows have "A = 6", SQL is almost certain to use the nonclus index.


If 1% of the rows have A=6 AND C=7 (the index has both columns, therefore both conditions limit the number of rows needing a key lookup), SQL is almost certain to do a table scan.

Typically the "tipping point" is somewhere between 10% and 30% of rows, but even that is just a rough guideline and could vary for specific situations.


The tipping point is where the number of rows needing a lookup = 30% of the number of pages in the table. Usually this is somewhere around 0.1% - 0.5% of the table, but depends on the size of the rows.



Fewest rows? The number of rows that match all the criteria will be the same regardless of the index used, right?

What is the basis for you claim that if only 1% of rows in the table match that SQL will almost certainly do a table scan? If the table has 1M rows, but only 10K match, you state SQL is almost certain to do a table scan??


Where did you get that 30% is a hard number for a tipping point? How could 0.1% of the rows in a table equal 30% of the table pages??

SQL DBA,SQL Server MVP('07, '08, '09)

Prosecutor James Blackburn, in closing argument in the "Fatal Vision" murders trial: "If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them."
GilaMonster
GilaMonster
SSC-Forever
SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)

Group: General Forum Members
Points: 47091 Visits: 44344
ScottPletcher (1/7/2013)
Fewest rows? The number of rows that match all the criteria will be the same regardless of the index used, right?


Match all criteria, yes. What I was thinking about is the case where there are multiple indexes, none of which have all the criteria.

SELECT <columns> FROM SomeTable WHERE Col1 = @A and Col2 = @B

and we have a two indexes, one on Col1 and one on Col2. The one that SQL will pick to seek is the one that is estimated to return the fewest rows so that SQL can do the fewest lookups in order to fetch the other column(s), for the secondary filter and the column list if necessary

If all the indexes have the columns for all the criteria, then the index used will be the one with the fewest leaf pages.

What is the basis for you claim that if only 1% of rows in the table match that SQL will almost certainly do a table scan? If the table has 1M rows, but only 10K match, you state SQL is almost certain to do a table scan??


Experience and testing. Yes, SQL is almost certain to scan at 1% of the rows in the table if it does not have a covering index to seek on (Almost certain. I can probably conjure an example where at 1% with accurate row estimations seeks and key lookups are done). Feel free to test it, or to see the many, many blog posts and articles on this subject.

Where did you get that 30% is a hard number for a tipping point?


iirc it's in Kalen's book. Not in the mood for hunting for it tonight.

How could 0.1% of the rows in a table equal 30% of the table pages??


Let's say rows of 100 bytes each. That's 80 per page. 1 000 000 rows is 12 500 pages. 30% of that is 3750 pages. 3750 rows would be 0.375% of the total rows in the table.

In case you misunderstood my comment, tipping point is when the number of rows to be looked up is equal to 30% of the total page count in the table


Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


ScottPletcher
ScottPletcher
Hall of Fame
Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)

Group: General Forum Members
Points: 3937 Visits: 6661
:-)
GilaMonster (1/7/2013)
What is the basis for you claim that if only 1% of rows in the table match that SQL will almost certainly do a table scan? If the table has 1M rows, but only 10K match, you state SQL is almost certain to do a table scan??


Experience and testing. Yes, SQL is almost certain to scan at 1% of the rows in the table if it does not have a covering index to seek on (Almost certain. I can probably conjure an example where at 1% with accurate row estimations seeks and key lookups are done). Feel free to test it, or to see the many, many blog posts and articles on this subject.

Where did you get that 30% is a hard number for a tipping point?


iirc it's in Kalen's book. Not in the mood for hunting for it tonight.

How could 0.1% of the rows in a table equal 30% of the table pages??


Let's say rows of 100 bytes each. That's 80 per page. 1 000 000 rows is 12 500 pages. 30% of that is 3750 pages. 3750 rows would be 0.375% of the total rows in the table.

In case you misunderstood my comment, tipping point is when the number of rows to be looked up is equal to 30% of the total page count in the table


I'm pretty sure the ~30% is the default, but based on what I've read, that is not a hard value because other factors affect it, including memory available, table size, row size, I/O affinity and parallelism.

0.375% is almost 4 times 0.1%; yes, the number goes very low, but I don't think it reaches quite down to 0.1% .

SQL DBA,SQL Server MVP('07, '08, '09)

Prosecutor James Blackburn, in closing argument in the "Fatal Vision" murders trial: "If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them."
GilaMonster
GilaMonster
SSC-Forever
SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)

Group: General Forum Members
Points: 47091 Visits: 44344
ScottPletcher (1/7/2013)
0.375% is almost 4 times 0.1%; yes, the number goes very low, but I don't think it reaches quite down to 0.1% .


I did say "Usually somewhere around 0.1%-0.5%"

p.s. I have seen the tipping point as low as 0.15%


Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


ben.brugman
ben.brugman
SSC-Addicted
SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)SSC-Addicted (463 reputation)

Group: General Forum Members
Points: 463 Visits: 2217
Thanks to all contributing to this thread, especially Gail and the others who triggered her in providing more information.

This made me wonder about Segments, do they come into the 'picture' at all with the discussed aspects. Does the optimizer 'know' about segments?
Or are segment only a way to 'bundle' pages and therefore help to limit the fragmentation?

Thanks for all the input,
Ben Brugman
GilaMonster
GilaMonster
SSC-Forever
SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)

Group: General Forum Members
Points: 47091 Visits: 44344
ben.brugman (1/8/2013)
This made me wonder about Segments, do they come into the 'picture' at all with the discussed aspects. Does the optimizer 'know' about segments?
Or are segment only a way to 'bundle' pages and therefore help to limit the fragmentation?


Segments?


Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search