Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

Index usage and RID lookup. Expand / Collapse
Author
Message
Posted Monday, January 7, 2013 6:23 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, July 21, 2014 2:19 AM
Points: 246, Visits: 1,166
Hello All,

An index with three fields, A,B,C
(

A selection with

SELECT * FROM A_TABLE WHERE A = 6 AND C =7

performance question:
Does this read all rows from A with a 6 in it. So RID lookup of all A = 6 rows?
Or does it use in the index the C field to 'filter' out only the C fields with the value?
(Is the filtering of the C part done on index level or on row level ?)

Some code has been attached for testing, with an example optimised for the index and an example as above. Offcourse the optimised query runs much faster, but this does not anwser my question.

The code can be tuned easely to produce more rows or another distribution of rows.
(The code as supplied produces (112669 row(s) affected))
(Resultset (30 row(s) affected))


-- Example:
-- Ben Brugman
-- 20130107
--

-- Altered from the first Posting, NOW heap is default.
-- First few anwsered to this mail had the cluster as default.
--
-- Example to test index behavior.
-- Runs as supplied. (Heap table is measured).
-- Comment out or run relevant parts.
--

/* BUILD CLUSTER */
/* CLUSTER */
/* CLUSTER */
/* To prevent any potential data loss issues, you should review this script in detail before running it outside the context of the database designer.*/
BEGIN TRANSACTION
GO
CREATE TABLE dbo.A_Table
(
A int NOT NULL,
B int NOT NULL,
C int NOT NULL,
Text_field varchar(300) NULL,
Number_off_other_fields varchar(300) NULL
) ON [PRIMARY]
GO
ALTER TABLE dbo.A_Table ADD CONSTRAINT
PK_A_Table PRIMARY KEY CLUSTERED
(
A,
B,
C
) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

GO
ALTER TABLE dbo.A_Table SET (LOCK_ESCALATION = TABLE)
GO
COMMIT

--
-- Script creates the A_table twice first as heap second as clustered table.
-- Run the appropriete parts of the script. (Select parts or comment out parts).
--
drop table A_Table

/* BUILD HEAP */
/* HEAP */
/* HEAP */
/* To prevent any potential data loss issues, you should review this script in detail before running it outside the context of the database designer.*/

BEGIN TRANSACTION
GO
CREATE TABLE dbo.A_Table
(
A int NOT NULL,
B int NOT NULL,
C int NOT NULL,
Text_field varchar(300) NULL,
Number_off_other_fields varchar(300) NULL
) ON [PRIMARY]
GO
ALTER TABLE dbo.A_Table ADD CONSTRAINT
PK_A_Table PRIMARY KEY NONCLUSTERED
(
A,
B,
C
) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

GO
ALTER TABLE dbo.A_Table SET (LOCK_ESCALATION = TABLE)
GO
COMMIT

/*GENERATE DATA*/
/*GENERATE DATA*/
/*GENERATE DATA*/
;
WITH
L0 AS(SELECT 0 AS c UNION ALL SELECT 0), -- 2
L1 AS(select 0 as x from L0 A, L0 b), -- 2 ^2 = 4
L2 AS(select 0 as x from L1 A, L1 b), -- 4 ^2 = 16
L3 AS(select 0 as x from L2 A, L2 b), -- 16 ^2 = 256
L4 AS(select 0 as x from L3 A, L3 b), -- 256 ^2 = 65536
L5 AS(select 0 as x from L4 A, L4 b), -- 65536 ^2 = ruim 4 E9
L9 AS(Select *, row_number() OVER(PARTITION BY x order by x ) as nr from L5) -- voeg rijnummers toe
insert into A_Table
SELECT
TA.NR,
TB.NR,
TC.NR,
CONVERT(VARCHAR(300),TA.NR)+' '+
CONVERT(VARCHAR(300),TB.NR)+' '+
CONVERT(VARCHAR(300),TC.NR)+' ',
'sum '+CONVERT(VARCHAR(300),TA.NR+TB.nr+TC.nr)+' product '+ CONVERT(VARCHAR(300),TA.NR*TB.nr*TC.nr)
from L9 TA, L9 TB, L9 TC where
TA.nr*TB.nr <1000 and -- change numbers to
TA.nr*TC.nr <1000 and -- change number of rows
Tb.nr*TC.nr <1000 and -- or influence distribution
TA.nr <1000 and
TB.nr <1000 and
TC.nr <1000
-- the overdetermined where clause is more efficient than 'short' where clause.

select top 3000 * from A_Table

SET STATISTICS TIME ON
SET STATISTICS IO ON

/*BENCHMARK OR TEST*/
/*BENCHMARK OR TEST*/
/*BENCHMARK OR TEST*/

dbcc dropcleanbuffers
dbcc freeproccache
--
-- selection in order of the fields.
--
select * from A_Table where A = 33 and B = 26 -- Optimised for the index.

dbcc dropcleanbuffers
dbcc freeproccache
--
-- selection of fields which are not in the order of the index
--
select * from A_Table where A = 33 and C = 26 -- Not optimised for th index

drop table A_Table -- Clean up

Post #1403569
Posted Monday, January 7, 2013 6:43 AM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 1:22 PM
Points: 23,011, Visits: 31,519
Actually, with the code you posted there is no difference. You index is a clustered index meaning that both queries use the clustered index. Looking at the actual execution plan, both do a clustered index scan (which is also a table scan as the clustered index is the table).


Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1403576
Posted Monday, January 7, 2013 6:50 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, July 21, 2014 2:19 AM
Points: 246, Visits: 1,166
Lynn Pettis (1/7/2013)
You index is a clustered index meaning that both queries use the clustered index.


The code for a Heap situation is included in the code as wel.
I should have made the Heap the default, sorry for that.

(heap is build, dropped, cluster is build. commenting out the drop and the cluster build will give the Heap situation).

With the size's provided the cluster does do a clustered search but there is a difference in efficiency. Because there is no other index on the cluster the access will offcourse be a cluster scan. (complete or part).
But cluster should not have been de default in the script.

I included the clustering, to check for performance on that as well.

Thanks for your time and attention,
Sorry to have choosen the wrong default.
Ben Brugman
Post #1403578
Posted Monday, January 7, 2013 6:59 AM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 1:22 PM
Points: 23,011, Visits: 31,519
ben.brugman (1/7/2013)
Lynn Pettis (1/7/2013)
You index is a clustered index meaning that both queries use the clustered index.


The code for a Heap situation is included in the code as wel.
I should have made the Heap the default, sorry for that.

(heap is build, dropped, cluster is build. commenting out the drop and the cluster build will give the Heap situation).

With the size's provided the cluster does do a clustered search but there is a difference in efficiency. Because there is no other index on the cluster the access will offcourse be a cluster scan. (complete or part).
But cluster should not have been de default in the script.

I included the clustering, to check for performance on that as well.

Thanks for your time and attention,
Sorry to have choosen the wrong default.
Ben Brugman



Taking a closer look at your code I see that now. You tell us to run your code from further below. Sorry, I missed that. I did a copy/paste/run of your code as is. Perhaps you should put your test code where it is needed instead of relying on us to move code around for you.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1403581
Posted Monday, January 7, 2013 7:24 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, July 21, 2014 2:19 AM
Points: 246, Visits: 1,166
Lynn Pettis (1/7/2013)

Taking a closer look at your code I see that now. You tell us to run your code from further below. Sorry, I missed that. I did a copy/paste/run of your code as is. Perhaps you should put your test code where it is needed instead of relying on us to move code around for you.


As you said I should have taken the 'other' default.
I changed it in the original posting so now the heap is default.
A copy/paste/run now works better.

Thanks for your advice,
Ben Brugman
Post #1403595
Posted Monday, January 7, 2013 7:31 AM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 1:22 PM
Points: 23,011, Visits: 31,519
Your code still only runs the test code for one of the indexes not both. If I want the execution plans for the queries using the clustered index now I still have to copy your code around to do it.

Both queries using the nonclustered index use the index (although the second query says an index on ColA, ColC would help). The second has to read more data from the index before going to the heap for further data which is why it probably runs a bit slower than the first.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1403600
Posted Monday, January 7, 2013 8:03 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 1:44 PM
Points: 42,454, Visits: 35,509
ben.brugman (1/7/2013)

Does this read all rows from A with a 6 in it. So RID lookup of all A = 6 rows?


No. It reads all rows from the index where A=6, filters them as it reads for C =7. Once it has all rows that match all conditions that have columns within the index, if the index is not covering the lookups are done to fetch the rest of the columns.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1403630
Posted Monday, January 7, 2013 8:08 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, July 21, 2014 2:19 AM
Points: 246, Visits: 1,166
Lynn Pettis (1/7/2013)
Your code still only runs the test code for one of the indexes not both. If I want the execution plans for the queries using the clustered index now I still have to copy your code around to do it.

Both queries using the nonclustered index use the index (although the second query says an index on ColA, ColC would help). The second has to read more data from the index before going to the heap for further data which is why it probably runs a bit slower than the first.


Sorry that I am not able to tell my question more clearly,
I am not asking the difference between two types of indexes,
but my question is how with a A,B,C index an index is processed if only A and C are used in the Where clause.

Is the selection of the C done in the index tree?
Or is the selection of the C done after the pages have been read in to memory as a filter?


The example was given as a starting point, because often code is asked. The question is independend of the code, but the code is used to clearify (not succeeding) the question and providing a starting point.

It is possible that because of the B field missing in the selection the C field is not used while reading the rows into memory. And that the wrong C rows are filtered out as an 'afterthought'.
It is also possible that at a specific moment while using the index the engine can exclude rows on that they do not fit the C.
Both this may also be dependend on the size of the table or de index. The example produces over 100 000 rows (easely changed), but results might differ for differenc sizes of rows and/or tables.

Hope this does make sense,
Ben Brugman

Post #1403632
Posted Monday, January 7, 2013 8:10 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, July 21, 2014 2:19 AM
Points: 246, Visits: 1,166
[b]GilaMonster (1/7/2013)
No. It reads all rows from the index where A=6, filters them as it reads for C =7. Once it has all rows that match all conditions that have columns within the index, if the index is not covering the lookups are done to fetch the rest of the columns.


Thanks this anwsers my question,
Ben,
Post #1403635
Posted Monday, January 7, 2013 8:11 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 1:44 PM
Points: 42,454, Visits: 35,509
ben.brugman (1/7/2013)

Or is the selection of the C done after the pages have been read in to memory as a filter?


All processing is done in memory, regardless of whether it's an index seek, scan, lookup, etc. The query processor has no idea what a file is or how to do disk access.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1403636
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse