
SQL Server: Part 10: Importance of Key Column Position While Creating an Index


We have discussed the different types of indexes in the previous posts. In this post, let us discuss key column order (the order of the columns in an index). The order of the key columns should be decided based on the data access pattern and on how you want the data to be organized.

The general guideline for index key column order is to keep the most selective column first. That does not mean that a unique ID column should be the first column of every index. The optimizer decides whether to use an index based on the statistics available on that index. I will explain statistics in a later post. Statistics give information about the density of the key columns, which indicates how unique the index keys are, and a histogram that stores the distribution of the values within the column.
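As a rough illustration of where the density and histogram information lives, the statistics behind an index can be inspected with DBCC SHOW_STATISTICS. The sketch below assumes the customer table and the Ix_Customerid_Countrycode index from the example later in this post already exist; the output will of course depend on your own data.

```sql
-- Density vector: one row per left-based prefix of the index key.
-- A lower "All density" value means the prefix is more selective.
DBCC SHOW_STATISTICS ('customer', Ix_Customerid_Countrycode) WITH DENSITY_VECTOR;
GO

-- Histogram: distribution of values, built on the leading key column only.
DBCC SHOW_STATISTICS ('customer', Ix_Customerid_Countrycode) WITH HISTOGRAM;
GO
```

Because the histogram covers only the leading key column, the choice of the first column also decides which column the optimizer has detailed distribution data for.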

Let us consider an example: a customer table that stores information about customers from many countries. The application running on top of this table deals with customers from a specific country, based on the user's permissions/access location.

CREATE TABLE customer (
   Customer_id     INT IDENTITY(1,1) NOT NULL,
   CountryCode     CHAR(3) NOT NULL,
   FirstName       VARCHAR(100) NOT NULL,
   LastName        VARCHAR(100) NOT NULL,
   MobilePhone     VARCHAR(20),
   Email           VARCHAR(100)
)

GO
CREATE UNIQUE CLUSTERED INDEX Ix_Customerid_Countrycode ON customer(Customer_id,Countrycode)


The clustered index is created following the general guideline of keeping the most selective column on the left. If I need to fetch a single record based on customer_id, this index works perfectly. So what is the drawback of this index? If I need to fetch all or many customers based on countrycode, the optimizer opts for a clustered index scan.


SET STATISTICS IO ON 
go

SELECT * FROM customer WHERE Countrycode='VNH' AND customer_id=1216468












Let us try to fetch all customers with countrycode VNH. The table has around 620 thousand records, and 3066 of them are customers with the VNH countrycode.

SELECT * FROM customer WHERE Countrycode='VNH'











From the execution plan, it is clear that the optimizer has opted for a clustered index scan, reading all 6825 pages used to store this table. We can optimize this by changing the index so that countrycode is the first column.

DROP INDEX customer.Ix_CustomerId_CountryCode 
GO
CREATE UNIQUE CLUSTERED INDEX Ix_CountryCode_CustomerId ON customer(Countrycode,Customer_id)


SET STATISTICS IO ON 
go

SELECT * FROM customer WHERE Countrycode='VNH' AND customer_id=1216468









Let us try to fetch all customers with countrycode VNH. 


SELECT * FROM customer WHERE Countrycode='VNH'










From the execution plan, it is clear that the optimizer used an index seek in both cases, and the IO has reduced drastically when fetching all the customers with the VNH countrycode.
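To put the logical reads reported by STATISTICS IO in context, it can help to know how many pages the table occupies in total. A minimal sketch, assuming the customer table from this post lives in the current database, is to query sys.dm_db_partition_stats:

```sql
-- Pages and rows used by the heap or clustered index of the customer table.
-- index_id 0 = heap, 1 = clustered index; only one of them exists per table.
SELECT SUM(ps.used_page_count) AS used_pages,
       SUM(ps.row_count)       AS total_rows
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'customer')
  AND ps.index_id IN (0, 1);
GO
```

A full clustered index scan should report logical reads in the same ballpark as used_pages, while a seek on a small range touches only a fraction of them.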

There is another problem. With customer_id as the first column, data is stored in customer_id order, and many pages (almost all of them) will hold rows belonging to multiple countrycodes. This may lead to more blocking/deadlock issues. By defining the index with countrycode as the first column, only a few pages will hold rows from more than one countrycode, which helps to reduce blocking. One important point: defining the index with countrycode as the first column will cause a higher level of index fragmentation, because new rows are inserted throughout the index rather than only at the end, but that can be controlled by setting a proper fill factor. I have experienced this in one of our projects and saw a lot of improvement after changing the indexes to have countrycode as the first column.
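As a hedged sketch of how fragmentation could be monitored and a fill factor applied, the statements below assume the Ix_CountryCode_CustomerId index created above. The fill factor of 90 is only an illustrative value, not a recommendation; the right number depends on your insert pattern and should be found by testing.

```sql
-- Check fragmentation of the indexes on the customer table.
SELECT i.name AS index_name,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'customer'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id;
GO

-- Rebuild the index, leaving free space on each leaf page for future inserts.
-- FILLFACTOR = 90 is an assumed example value.
ALTER INDEX Ix_CountryCode_CustomerId ON customer
REBUILD WITH (FILLFACTOR = 90);
GO
```

A lower fill factor leaves more room for inserts within each countrycode range and so delays page splits, at the cost of a larger index and more pages to read.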

In short, the general guideline about key column order is a good starting point, but you also have to consider the data access pattern of your application. Hope this helps you to resolve some of the issues you are facing.

If you liked this post, do like my page on FaceBook
