Clustered and NonClustered Index

Question

Clustered and NonClustered Index

er.mayankshukla

SSCarpal Tunnel

Points: 4190
More actions
January 8, 2016 at 2:28 am

#314649

I need an explanation may be pointing to some whitepaper for the explanation of below:
In a unique non clustered index, the clustered key comes to leaf level only
in a non unique non clustered index, the clustered key comes to leaf and non leaf level
I was watching Gail's video
https://www.youtube.com/watch?v=y_bl9dArtmA
So may be Gail can help me in this 🙂

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply

Hugo Kornelis SSC Guru Points: 64780 More actions · Answer 1

er.mayankshukla (1/8/2016)
I need an explanation may be pointing to some whitepaper for the explanation of below:
In a unique non clustered index, the clustered key comes to leaf level only
in a non unique non clustered index, the clustered key comes to leaf and non leaf level
I was watching Gail's video
https://www.youtube.com/watch?v=y_bl9dArtmA
So may be Gail can help me in this 🙂

I don't think this is correct. Any nonclustered index (whether unique or not) adds the clustered index column(s) to the leaf pages only.

You probably misunderstood something Gail said. I started to watch the video but it's a one-hour talk and I don't have that much time to spare.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

BLOB EATER SSChampion Points: 14609 More actions · Answer 2

Yeah I am interested in this as I thought for a non unique nonclustered index that the CI key needs to go up the tree in order to efficiently locate rows in the leaf level.

er.mayankshukla SSCarpal Tunnel Points: 4190 More actions · Answer 3

er.mayankshukla

SSCarpal Tunnel

Points: 4190

January 8, 2016 at 4:23 am

#1850450

Yes, May be Gail can answer this

Grant Fritchey SSC Guru Points: 398690 More actions · Answer 4

Yeah, I too thought the keys were stored at the leaf level. I'm positive Gail knows that. It must be a misstatement or misinterpretation of what she said.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 5

Hugo Kornelis (1/8/2016)
er.mayankshukla (1/8/2016)
I need an explanation may be pointing to some whitepaper for the explanation of below:
In a unique non clustered index, the clustered key comes to leaf level only
in a non unique non clustered index, the clustered key comes to leaf and non leaf level
I was watching Gail's video
https://www.youtube.com/watch?v=y_bl9dArtmA
So may be Gail can help me in this 🙂
I don't think this is correct. Any nonclustered index (whether unique or not) adds the clustered index column(s) to the leaf pages only.
You probably misunderstood something Gail said. I started to watch the video but it's a one-hour talk and I don't have that much time to spare.

The subject matter for this post starts at time marker 11:50.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Hugo Kornelis SSC Guru Points: 64780 More actions · Answer 6

Jeff Moden (1/8/2016)
The subject matter for this post starts at time marker 11:50.

Thanks, Jeff! Found it, watched it, and Gail is pretty explicit about it.

Coming from a lot of people, I would have put this aside as a misunderstanding from the presenter. Coming from Gail who normally knows her stuff very well, I will postpone judgement until after I have double checked this in a test database.

BRB!

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Jacob Wilkins One Orange Chip Points: 27976 More actions · Answer 7

This seems relevant as well: http://www.sqlskills.com/blogs/kimberly/nonclustered-indexes-lookup-key-btree/

Cheers!

Hugo Kornelis SSC Guru Points: 64780 More actions · Answer 8

Okay, did a small test and confirmed that my faith in Gail was justified (and my faith in my own understanding of index internals was not). After creating a test table and two nonclustered indexes, one unique and one nonunique, and adding 10,000 rows; I used DBCC IND to find the root pages and DBCC PAGE to examine their contents. The nonunique index did indeed have extra data added to each entry, and that data corresponds to the primary key values.

I then decided to check the documentation where I got my understanding from - and though phrased in a hard-to-understand way, I must admit that it's there and that I just had overlooked it so far. (See https://msdn.microsoft.com/en-us/library/ms190620.aspx - step 2, item 2 in the list, first and second paragraph).

Obviously, the next question is why this is implemented like this. Why would SQL Server need the clustered index key on the non-leaf levels? (Which is obviously also what the original question in this thread was after).

NOTE: This is where I enter into speculation mode!

Think about when a row is deleted (or updated affecting data in the nonclustered index). If the row is located through another index, then SQL Server needs to find it in this nonclustered index as well. The indexed value will be known (because updates and deletes always start from the clustered index), but now the right row has to be found. Now suppose that the row affected has a clustered index value of 12345, and the value in the nonclustered index is "green". The table has 1,000,000 entries, and approximately 25,000 of them have the value "green" in the indexed column. With the clustered index value in the leaf pages only, SQL Server has to seek to the first "green" row, read the clustered index value from the leaf page, check if it's the right one and repeat - it will probably have to visit a lot of leaf pages in that process.

So what I guess is that the clustered index value is added to the nonclustered index, not as just some additional data in the index but as an additional indexed column, and hence the data is added to the nonleaf pages as well. Now SQL Server can directly seek for row "green / 12345", and save a lot of IO.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Hugo Kornelis SSC Guru Points: 64780 More actions · Answer 9

Thanks for the link, Jacob! That post confirms what I put out as speeculation.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 10

Hugo Kornelis (1/8/2016)
Okay, did a small test and confirmed that my faith in Gail was justified (and my faith in my own understanding of index internals was not). After creating a test table and two nonclustered indexes, one unique and one nonunique, and adding 10,000 rows; I used DBCC IND to find the root pages and DBCC PAGE to examine their contents. The nonunique index did indeed have extra data added to each entry, and that data corresponds to the primary key values.
I then decided to check the documentation where I got my understanding from - and though phrased in a hard-to-understand way, I must admit that it's there and that I just had overlooked it so far. (See https://msdn.microsoft.com/en-us/library/ms190620.aspx - step 2, item 2 in the list, first and second paragraph).
Obviously, the next question is why this is implemented like this. Why would SQL Server need the clustered index key on the non-leaf levels? (Which is obviously also what the original question in this thread was after).
NOTE: This is where I enter into speculation mode!
Think about when a row is deleted (or updated affecting data in the nonclustered index). If the row is located through another index, then SQL Server needs to find it in this nonclustered index as well. The indexed value will be known (because updates and deletes always start from the clustered index), but now the right row has to be found. Now suppose that the row affected has a clustered index value of 12345, and the value in the nonclustered index is "green". The table has 1,000,000 entries, and approximately 25,000 of them have the value "green" in the indexed column. With the clustered index value in the leaf pages only, SQL Server has to seek to the first "green" row, read the clustered index value from the leaf page, check if it's the right one and repeat - it will probably have to visit a lot of leaf pages in that process.
So what I guess is that the clustered index value is added to the nonclustered index, not as just some additional data in the index but as an additional indexed column, and hence the data is added to the nonleaf pages as well. Now SQL Server can directly seek for row "green / 12345", and save a lot of IO.

Very well done, Hugo. Your research and the link that Jacob posted saved me a whole lot of time.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 11

Jacob Wilkins (1/8/2016)
This seems relevant as well: http://www.sqlskills.com/blogs/kimberly/nonclustered-indexes-lookup-key-btree/
Cheers!

Excellent reference. You can't do much better that getting it straight from the horse's mouth. Thanks, Jacob.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 12

er.mayankshukla (1/8/2016)
In a unique non clustered index, the clustered key comes to leaf level only
in a non unique non clustered index, the clustered key comes to leaf and non leaf level

Yup, that's correct. What clarification do you need?

Basically, if the nonclustered index is defined as UNIQUE, the clustered index key is treated as include columns (unless explicitly specified in the nonclustered index key), if the nonclustered index is not defined as unique, the clustered index key is treated like key columns and added at the end of the explicitly specified key.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 13

Edit: nm

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

er.mayankshukla SSCarpal Tunnel Points: 4190 More actions · Answer 14

Hi All,

I took some time to write a detail article on this, which might be useful for all for our reference

https://sqlmayank.wordpress.com/