Choosing columns for a clustered index: why is it so important?

Brahmanand Shukla

SSC Eights!

Points: 990
October 6, 2016 at 5:50 am

I want to write an article on choosing the right columns for a clustered index and why it is so important. The choice matters most for a read/write-intensive database.
It will cover:
1. What is a clustered index?
2. Why always have a clustered index? What can happen without one?
3. Locking behavior in the absence of a clustered index
4. Hard delete vs. soft delete: which to choose and why?
5. (Key lookup + covering indexes) vs. (clustered index)
6. Requirements vs. best practices in choosing the right column for the clustered index
Please share your feedback and let me know whether I should go ahead with this.
SQL Server Carpenter - Consultant & Trainer
Author of Learn T-SQL from Scratch

Eric M Russell SSC Guru Points: 125627 · Answer 1

I'd also mention the issue of page splitting, fragmentation, fill factor, and how it all relates to choosing the optimal clustering key combination.

https://sqlperformance.com/2015/04/sql-indexes/mitigating-index-fragmentation

Imagine you're stacking books on a shelf sorted by author and you leave little or no room in between. Then one day you receive a shipment of 100 books all by the author Nora Roberts. What must you do to make room while maintaining the correct order? You must shuffle other books to new locations. In terms of row store tables, page splitting is the equivalent, and it can result in significant I/O overhead.
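
As a rough T-SQL sketch of the shelf analogy, the script below clusters a hypothetical dbo.Book table on an author column, leaves free space on each page with FILLFACTOR, and then inspects fragmentation. The table, column, and index names and the fill factor of 80 are assumptions chosen for illustration, not recommendations.

```sql
-- Hypothetical table; names and sizes are illustrative only.
CREATE TABLE dbo.Book
(
    BookId int IDENTITY(1,1) NOT NULL,
    Author nvarchar(100)     NOT NULL,
    Title  nvarchar(200)     NOT NULL
);

-- Clustering on Author means new rows can land in the middle of the index,
-- like shelving new books between existing ones.
CREATE CLUSTERED INDEX CIX_Book_Author
    ON dbo.Book (Author, BookId);

-- Fill factor is applied when an index is built or rebuilt.
-- FILLFACTOR = 80 leaves roughly 20% of each leaf page free (assumed value).
ALTER INDEX CIX_Book_Author ON dbo.Book
    REBUILD WITH (FILLFACTOR = 80);

-- After a burst of inserts for one author, check fragmentation and page fullness.
-- SAMPLED mode is used because LIMITED mode does not report page space used.
SELECT  ips.index_id,
        ips.avg_fragmentation_in_percent,
        ips.avg_page_space_used_in_percent,
        ips.page_count
FROM    sys.dm_db_index_physical_stats(
            DB_ID(), OBJECT_ID(N'dbo.Book'), NULL, NULL, 'SAMPLED') AS ips;
```

Comparing the numbers before and after a large insert of rows for a single author gives a feel for how much shuffling the chosen key causes.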

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Luis Cazares SSC Guru Points: 183706 · Answer 2

The subject is interesting, but you're trying to cover a wide set of topics. Consider writing a series of articles instead of just one. At least, that's my opinion.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

Brahmanand Shukla SSC Eights! Points: 990 · Answer 3

Yeah, you are right. :-)

I will break it into multiple articles and try to link them.

Thanks for your feedback. I am new to writing on a public forum. I thought I would share my experience and learning in case it is helpful to others.

Thank you once again!

Can I go ahead and start it?

Abhishek SSCommitted Points: 1716 · Answer 4

September 27, 2018 at 2:03 am

Who is stopping you? Go ahead.

Brahmanand Shukla SSC Eights! Points: 990 · Answer 5

I have written my own blog
https://brahmanand.tech.blog

Lynn Pettis SSC Guru Points: 442462 · Answer 6

Brahmanand Shukla - Thursday, September 27, 2018 9:01 AM
I have written my own blog
https://brahmanand.tech.blog

I will disagree with #6, "Object, Column and Variable name should be in Title case."

You should code these as they are declared in the database. If the column name is declared as 'object_id', then you should code it as 'object_id' (check the system tables). My rule of thumb: code as if you are working in a database or instance where the collation is case sensitive, even if it isn't. You never know when your code may get used in such an environment. This is just a defensive coding method I have chosen to use, and it has saved my butt numerous times.
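
A minimal sketch of that rule of thumb, using the sys.objects catalog view. The second query is left commented out; in a database with a case-sensitive collation it would be expected to fail with an invalid column name error, which is the situation this habit guards against.

```sql
-- Matches the declared column names exactly, so it runs under any collation.
SELECT name, object_id
FROM   sys.objects
WHERE  type = 'U';

-- Same query with "Title case" identifiers. Under a case-sensitive database
-- collation the column references are expected to be rejected, because the
-- columns are declared as name and object_id.
-- SELECT Name, Object_Id FROM sys.objects WHERE type = 'U';
```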

Lynn Pettis SSC Guru Points: 442462 · Answer 7

I also disagree with #17, "Tables whose columns are not used in query should not be there in joins."

I have used columns in joins even if they aren't used elsewhere in the query. Depending on the data and the tables, the JOIN may be the only place a table's columns appear in a query.
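
One common case, sketched with hypothetical tables (dbo.Student, dbo.Enrollment, and dbo.Course are assumptions for illustration): a junction table whose columns appear only in the join conditions, yet the query cannot be written correctly without it.

```sql
-- dbo.Enrollment contributes nothing to the SELECT list,
-- but it is the only link between students and courses.
SELECT  s.StudentName,
        c.CourseName
FROM    dbo.Student    AS s
JOIN    dbo.Enrollment AS e ON e.StudentId = s.StudentId
JOIN    dbo.Course     AS c ON c.CourseId  = e.CourseId;
```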

Lynn Pettis SSC Guru Points: 442462 · Answer 8

And yes, I also disagree with #21, "Don't use CURSOR. Use WHILE loop in place of cursor."

Yes, avoid cursors where possible. But where appropriate, a fire hose cursor can perform better than a WHILE loop with temporary tables that require maintenance to support the loop. Plus, even a cursor requires a WHILE loop.
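
For reference, a forward-only, read-only ("fire hose") cursor might look like the sketch below. It walks sys.tables only so that it runs in any database; the cursor name and the PRINT placeholder are illustrative assumptions.

```sql
DECLARE @name sysname;

-- LOCAL FAST_FORWARD: scoped to this batch, forward-only and read-only.
DECLARE table_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.tables ORDER BY name;

OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @name;

-- The WHILE loop is still there; the cursor just supplies the rows.
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name;  -- placeholder for the real per-row work
    FETCH NEXT FROM table_cursor INTO @name;
END;

CLOSE table_cursor;
DEALLOCATE table_cursor;
```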

Lynn Pettis SSC Guru Points: 442462 · Answer 9

And #29, "Avoid Dynamic SQL."

Dynamic SQL is a tool. Used appropriately it is good; used inappropriately it is evil. There are times when dynamic SQL is necessary. When you use it, code defensively to avoid SQL injection. Use EXEC sp_executesql so you can pass values to the dynamic SQL as parameters where this makes sense. Table-valued parameters (TVPs) also make sense here for sending multiple values where needed.
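
A minimal sketch of that defensive pattern, assuming a hypothetical dbo.Customer table with CustomerId, CustomerName, and City columns. Object names go through QUOTENAME, and the search value travels as a real parameter instead of being concatenated into the string.

```sql
DECLARE @schema sysname      = N'dbo',
        @table  sysname      = N'Customer',  -- hypothetical table
        @city   nvarchar(50) = N'Pune';      -- value supplied by the caller
DECLARE @sql    nvarchar(max);

-- Identifiers cannot be parameterized, so quote them explicitly.
SET @sql = N'SELECT CustomerId, CustomerName FROM '
         + QUOTENAME(@schema) + N'.' + QUOTENAME(@table)
         + N' WHERE City = @city;';

-- The value is passed as a typed parameter, never spliced into the text.
EXEC sys.sp_executesql
     @sql,
     N'@city nvarchar(50)',
     @city = @city;
```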

Lynn Pettis SSC Guru Points: 442462 · Answer 10

I reread #1, and it is an agree/strongly disagree: "SET NOCOUNT ON and TRANSACTION ISOLATION LEVEL READ UNCOMMITTED should be there at the beginning of Stored Procedure."

You do know the problems you can experience when using transaction isolation level READ UNCOMMITTED, don't you? It allows dirty reads, non-repeatable reads, and phantom reads. If the data MUST BE RIGHT, you don't want this.
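
A dirty read can be reproduced with two query windows, roughly as sketched here. The dbo.Account table and its columns are assumptions for illustration; run each part in the session named in the comment.

```sql
-- Session 1: change a row but do not commit yet.
BEGIN TRANSACTION;
UPDATE dbo.Account SET Balance = Balance - 100 WHERE AccountId = 1;

-- Session 2: sees the uncommitted balance (a dirty read).
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Balance FROM dbo.Account WHERE AccountId = 1;

-- Session 1: undo the change. Session 2 has already read a value
-- that was never committed.
ROLLBACK TRANSACTION;
```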

dmbaker SSCertifiable Points: 5144 · Answer 11

I kinda disagree with #27 and #30, though I guess it depends on what version of SQL Server we're talking about. I wouldn't use @@ERROR; I'd use ERROR_NUMBER() and the other ERROR_* functions instead. Plus I'd wrap things in BEGIN TRY/BEGIN CATCH and deal with the error and any pending transaction in the CATCH (oh, that reminds me... you don't mention how to check for and deal with an uncommittable transaction).

In #30, what does "properly return" mean? Output parameters? A custom error message? Why bother? The client will receive the error message (unless you swallow it in a CATCH, which you probably shouldn't do...I generally re-THROW the original error).
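
A sketch of that pattern, assuming SQL Server 2012 or later (THROW is not available before that) and a hypothetical dbo.Account table. XACT_STATE() covers both the committable and the uncommittable case.

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE dbo.Account SET Balance = Balance - 100 WHERE AccountId = 1;
    UPDATE dbo.Account SET Balance = Balance + 100 WHERE AccountId = 2;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- XACT_STATE() = -1: transaction is uncommittable and can only be rolled back.
    -- XACT_STATE() =  1: transaction is still open and committable; roll it back here.
    -- XACT_STATE() =  0: no open transaction, nothing to undo.
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;

    -- Re-raise the original error so the client receives it unchanged.
    THROW;
END CATCH;
```

Inside the CATCH block, ERROR_NUMBER(), ERROR_MESSAGE(), and the related functions are available if the error needs to be logged before it is re-thrown.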

Maybe you should qualify this as being best practices for you. 🙂

Eitan Blumin SSC Enthusiast Points: 172 · Answer 12

Hey, it's been a long time since anyone responded on this thread, but in case anyone is interested, I've recently published a blog post dealing with this issue of clustered indexes:

https://eitanblumin.com/2019/12/30/resolving-tables-without-clustered-indexes-heaps/