Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

LivingForSqlServer

I am Ramkumar, 34 years old Consultant, trainer, blogger and speaker at SQL Server user group in India. I have more than 10 years of experience as a developer and SQL Server DBA. I love reading, teaching and blogging about SQL Server internals and performance tuning. My Facebook page: https://www.facebook.com/LivingForSqlServer

SQL Server Storage Internals Part 5 - Properties of Clustered and Non Clustered Index pages

In this article, I thought of explaining different types and levels of index pages. but while start preparing for that, I realise , newbies need to understand concepts on Index architecture before moving on to see what is there inside index pages.

So, before going see internals of Clustered and NonClustered index pages, lets quickly recap some key points on Clustered and Non Clustered Indexes and its Architecture. This will help Us to quickly understand internals of different types and levels of index pages.

Note : Points given here are taken from msdn and I 've used a suitable query from MS press SQL Server 2008 Internals book.
Please have a look at reference section for more details.

Common Properties of Clustered and Non Clustered Indexes:

1. Indexes are organized as B-trees.
2. Each page in an index B-tree is called an index node.
3. The top node of the B-tree is called the root node.
4. The bottom level of nodes in the index is called the leaf nodes.
5. Any index levels between the root and the leaf nodes are collectively known as intermediate levels.
6. Each index row contains a key value and a pointer to either an intermediate level page in the B-tree, or a row in the leaf level of the index.
7. The pages in each level of the index are linked in a doubly-linked list.
8. The page collections for the B-tree are anchored by page pointers in the sys.system_internals_allocation_units system view.

Properties of Clustered Index:
1. The root and intermediate level nodes contain index pages holding index rows.
2 Leaf nodes of clustered index contain the data pages of the underlying table.
3. Clustered indexes have one row in sys.partitions, with index_id = 1 for each partition used by the index.
4. The pages in the data chain and the rows in them are ordered on the value of the clustered index key.
5. All inserts are made at the point where the key value in the inserted row fits in the ordering sequence among existing rows.
6. For a clustered index, the root_page column in sys.system_internals_allocation_units points to the top of the clustered index for a specific partition.
7. SQL Server moves down the index to find the row corresponding to a clustered index key.
8. To find a range of keys, SQL Server moves through the index to find the starting key value in the range and then scans through the data pages   using the previous or next pointers.
9. To find the first page in the chain of data pages, SQL Server follows the leftmost pointers from the root node of the index.
10.If the clustered index is not created with the UNIQUE property, the Database Engine automatically adds a 4-byte uniqueifier column to the table.  When it is required, the Database Engine automatically adds a uniqueifier value to a row to make each key unique.

3_clustered_index_architecture1


Clustered Index Design Guidelines
1. With few exceptions, every table should have a clustered index defined on the column, or columns that offer the following:

1.1 Can be used for frequently used queries.
1.2 Provide a high degree of uniqueness and are accessed sequentially
1.3 Can be used in range queries.
1.4 Used frequently to sort the data retrieved from a table.

2. Before you create clustered indexes, understand how your data will be accessed. Consider using a clustered index for queries that do the following:

2.1 Return a range of values by using operators such as BETWEEN, >, >=, <, and <=.
2.2 Return large result sets.
2.3 Use JOIN clauses; typically these are foreign key columns.
2.4 Use ORDER BY, or GROUP BY clauses.

3 Clustered indexes are not a good choice for the following attributes:

3.1 Columns that undergo frequent changes
3.2 Wide keys
3.3 columns having less unique values

Properties of Non Clustered Index:
1. Nonclustered indexes have the same B-tree structure as clustered indexes, except for the following significant differences:
        1.1 the data rows of the underlying table are not sorted and stored in order based on their
                 nonclustered keys.
        1.2 The leaf layer of a nonclustered index is made up of index pages instead of data pages.
2. Nonclustered indexes can be defined on a table with a clustered index or a heap.
3. Each index row in the nonclustered index contains the nonclustered key value and a row locator.
4. The row locators in nonclustered index rows are either a pointer to a row or are a clustered index key
for a row:

4.1 If the table is a heap, the row locator is a pointer to the row.
      The pointer is built from the file identifier (ID), page number, and number of the row on the
      page.  The whole pointer is known as a Row ID (RID).
4.2 If the table has a clustered index, or the index is on an indexed view, the row locator is the
      clustered index key for the row.

5.  Nonclustered indexes have one row in sys.partitions with index_id >1 for each partition used by the index.
6. For a Nonclustered index, the root_page column in sys.system_internals_allocation_units points to
the top of the non clustered index for a specific partition.
5_4_architecture_of_non_clustered

Nonclustered Index Design Guidelines
Consider using a nonclustered index for queries that have the following attributes:

1. Use JOIN or GROUP BY clauses.
2. Create multiple nonclustered indexes on columns involved in join and grouping operations, and a clustered index on any foreign key columns.
3. Queries that do not return large result sets.
4. Contain columns frequently involved in search conditions of a query, such as WHERE clause, that return exact matches.

How to find different levels of clustered Index pages:

To understand Clustered index architecture, let’s create a simple table named tExample5 and create clustered index on intEmployeeId column

CREATE TABLE tExample5(

intEmployeeId int IDENTITY(10001,1),

strFirstName varchar(100),

strDeptCode int default round(rand()*999999,0),

strAddress varchar(1000),

strSalary int default round(rand()*999999,0))

GO

 

CREATE CLUSTERED INDEX NCI_tExample5_intEmployeeId ON tExample5(intEmployeeId)

GO

 

Let’s insert 50000 rows in tExample5 table to form at least 3 levels of b-tree (Root, Intermediate and Leaf level)

INSERT INTO tExample5(strFirstName, strAddress) VALUES('AAAAA',REPLICATE('A',1000))

GO 50000

Here is the query to list page allocation in each level of clustered index:

SELECT index_depth AS D

, index_level AS L

, record_count AS 'Count'

, page_count AS PgCnt

, avg_page_space_used_in_percent AS 'PgPercentFull'

, min_record_size_in_bytes AS 'MinLen'

, max_record_size_in_bytes AS 'MaxLen'

, avg_record_size_in_bytes AS 'AvgLen'

FROM sys.dm_db_index_physical_stats (DB_ID ('LearningInternals') , OBJECT_ID ('tExample5'), 1, NULL, 'DETAILED');

 

Output:

5_ALlocationDetails1

Observation:
1. Intermediate levels are formed based on length of key field. If Index key is short enough, an Index page will cover large data area resulting better query performance.
2. Based on this observation, it’s recommended to keep index key short for better performance.

Here is the DBCC command to list details of pages allocated to table tExample5

DBCC TRACEON(3604)

DBCC IND('LearningInternals', 'tExample5', -1)

DBCC TRACEOFF(3604)

 

Output (4 out of 7171 pages having different index level are listed):

 

 5_2_page_allocation_in_each_level_of_btree1

 

Observation:

1.       Page usage:

IAM Page

1

Root Page

1

Intermediate pages

26

Data Pages

7143

Total Pages

7171

 

2.       Sample page numbers in each B-Tree level:

PageFID

PagePID

PageType

PageTypeDescription

IndexLevel

IndexLevelDescription

1

256

10

IAM Page

NULL

IAM page.

1

270

2

Index page

2

root level (index page)

1

2247

2

Index page

1

Intermediate level (index page)

1

8474

1

Data page

0

Leaf level (data page)

 

Summary:

In this section, we have seen properties of Clustered and Non Clustered indexes and how to see page allocations in B-Tree structure.

In next section, let’s see how to read different level and types of index pages.

Reference:

Microsoft Press SQL Server 2008 Internals
http://msdn.microsoft.com/en-us/library/ms177443.aspx
http://msdn.microsoft.com/en-us/library/ms190639.aspx
http://msdn.microsoft.com/en-us/library/ms177484.aspx
http://msdn.microsoft.com/en-us/library/ms179325.aspx
 

Comments

Posted by Anonymous on 27 December 2010

Pingback from  Ramkumar (LivingForSqlServer) posts SQL Server Storage Internals Part 5 &#8211; Properties of Clustered and Non Clustered Index pages | sqlmashup

Posted by Mohan Ayyavu on 29 December 2010

too good. thanks.

Posted by GilaMonster on 29 December 2010

One correction:

You said "The pages in the data chain and the rows in them are ordered on the value of the clustered index key. "

Neither is necessarily true. If the index is at all fragmented then the physical order of the pages does not match the logical order. That's the definition of fragmentation.

Rows are no necessarily ordered on a page. There's a slot index at the end of each page that is correctly ordered and that points to the rows on the page. Reason for this is so that SQL doesn't have to move rows around on the page if it's doing an insert of a row that fits in the 'middle'. It can just put the row wherever there's space on the page and then update the slot index to show the correct order.

I also completely disagree with several of your index 'guidelines' but that's a much longer discussion.

Posted by David Russell-366997 on 29 December 2010

excellent explanation.

Posted by Ramkumar on 29 December 2010

Thanks amohanece and David for visiting my blog and your comments.

GilaMonster,

Thanks much for visiting my blog.

1. Reg. your comment on first item "records in clustered index are physically ordered"

Yes you are right. Rows in CI can be in any order. Row Offset Array is to point records in order of key.

I plan to post points like this as part of myths and realities in coming days.

2. Reg your point "I also completely disagree with several of your index 'guidelines' but that's a much longer discussion."

Yes. Performance tuning experience is the right teacher for choosing right indexes.

for technical accuracy I 've taken all these guidelines from MSDN. I 've given all links in reference section.

expecting your frequent visit to my blog and your valuable comments.

Posted by dstinson on 29 December 2010

Ramkumar,

I'm really enjoying this series.  You do a great job of explaining and demonstrating a complicated subject.  Nice work!

One note: In the query to list page allocation of each level of the clustered index, I believe the index_id argument of dm_db_index_physical_stats should be 1, not 2.

Dan

Posted by sqlchaser on 29 December 2010

Excellent and simplistic explanation.  The best I have read regarding indexes, and I have read many...

Posted by Ramkumar on 29 December 2010

Thanks dstinson and sqlchaser for visiting my blog and your comments.

dstinson, thanks for pointing this out.

I updated the query with index_id 1

Posted by Paul Randal on 29 December 2010

A few more corrections:

1) intermediate pages in an index are index pages, not data pages as you say above in your explanation of index level in DBCC IND output.

2) indexes are actually stored as B+ trees, not classic B-trees (I believe I mention that in the indexing chapter of SQL Server 2008 Internals) - but everyone calls them b-trees for simplicity.

Posted by Anonymous on 29 December 2010

Pingback from  Twitter Trackbacks for                 SQL Server Central, SQL Server Storage Internals Part 5 - Properties of Clustered and Non Clustered Index pages - LivingForSqlServer         [sqlservercentral.com]        on Topsy.com

Posted by Ramkumar on 29 December 2010

Paul, Thanks for visiting my blog.

Thanks for pointing out mistake in DBCC IND output. let me update this now. (along with 2 more small mistakes in other pictures).

Reg. your comment on B-Tree, its worth to mention this in Myths and Realities list. I started learning more on this.

Leave a Comment

Please register or log in to leave a comment.