SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Understanding Indexes


Understanding Indexes

Author
Message
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)

Group: General Forum Members
Points: 203800 Visits: 41949
Gail and Grant (or anyone else that can jump in),

The article also has the following statements:

...whereas, with a nonclustered index the leaf level contains the address to the data pages contained in the search. Simply put, a clustered index is stored and sorted with the data in the clustered index key order. While the nonclustered index may be thought of as a pointer to the location of the data to which it refers.



I believe, from what you've shown me with DBCC IND, that the bolded statements in the above quote from the article are just not correct. To wit, if they were correct, even a covering index would cause "BookMark Lookups". Since I'm not an expert on indexes that you good folks are, can you confirm that the bolded statements in the above quote are, depending on whether it's a covering index or not, incorrect... or not? I realize that data is stored in the index but that data is actually used if it's a covering index.

Or... perhaps I'm just being a bit too critical of an introductory article.

{edit} And, yes, I realize that either the clustered index or an internal row number (for heaps) is used in an index which may be thought of as an "address" of sorts but the statements in the quote make it sound as if only address information is stored or that the lookup data stored cannot be used directly.

Thanks.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)SSC Guru (203K reputation)

Group: General Forum Members
Points: 203800 Visits: 41949
Grant Fritchey (2/15/2011)
Define a clustered index on the column that appears most frequently in the WHERE clause of SELECT
statements.


But not in JOIN criteria? No, no way. The basic rule for clusters should be the most frequently used access path to the data. This may be primary keys, this may be foreign keys, or it might be simply search criteria, but I wouldn't suggest limiting it to WHERE clauses.


Again, I'm not the "index Ninja" here, but I've also found that a table that suffers huge numbers of inserts can make very good use of a Clustered Index on an auto-numbering column such as an IDENTITY column or, perhaps, a date column to keep page splits to a reasonable level whether or not that column is the most frequently used in WHERE or JOIN criteria.

Have any of you good folks experienced the same or is there some other practice that folks use on a high insertion rate table?

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Peter Trast
Peter Trast
Ten Centuries
Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)Ten Centuries (1K reputation)

Group: General Forum Members
Points: 1024 Visits: 655
"Select columns that are unique such as an IDENTITY column or contain many distinct values." ???

Really? Does the fact that a column is an identity column really qualify it for an index if the identity is never used in a query? This is an honest question, NOT sarcasm. I certainly have a lot to learn about indexes still. But I ask this because I recently spent some time with a customer that was looking to upgrade his storage because of what he perceived were disk contention issues. I came in with an eye to look at tuning the SQL environment before recommending any hardware or software upgrades and I found that he had followed the advice of a SQL blog that recommended making all of his clustered indexes on the primary keys (which all happened to be IDENTITY columns), none of which were ever mentioned in queries to the database, which I know might be unusual, but this data was only used for 2 import processes into what was essentially a mostly read environment (I know that is not normal and we are discussing warehousing) which never called these specific ID columns after import.

The point I am making is that we need to be careful about the reasons we choose columns to index on and avoid focusing on a SINGLE criteria, like "it's an IDENTITY column". These statements should have caveats and qualifiers, like "index on an IDENTITY column or unique column IF you use it as part of a query. If you never or rarely query against a column, you may never see a benefit, maybe only add overhead for no good reason". The result of removing these indexes from these ID columns and only adding a couple of indexes on data that was actually queried resulted in major performance improvements and refocused the equipment purchase.

And a statement like ...
"Conceptually, one can consider an index to be a data structure containing pairs ((a1,...,an), b), where (a1,c,an) is the tuple of values for the attributes A1,c,An and b is the address of the tuples in r with these values, that is, a pointer to the set ,c, " (Mannila & Raiha, 1992)"
...is not very helpful in a level 100 post for "a junior DBA". Smile

And if you want to describe the use of B-trees, maybe explaining what a B-tree is would be helpful. Smile Even though most of us who post and read here may have a strong grasp, remember the intended audience.

But thanks for bringing it up. Indexes are a very important part of what we do BUT let me add, STATISTICS maybe more important. For level 300 or 400 people you might enjoy this from the MCM series:
http://technet.microsoft.com/en-us/sqlserver/ff977043.aspx

I highly recommend even junior DBA's review some of these videos. They give insight into some topics that may be a little too advanced, but anyone of any level can get some good info.

Thanks for the post Ron! BigGrin

Peter Trast
Microsoft Certified ...(insert many literal strings here)
Microsoft Design Architect with Alexander Open Systems
Grant Fritchey
Grant Fritchey
SSC Guru
SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)

Group: General Forum Members
Points: 93773 Visits: 33010
Jeff Moden (2/15/2011)
Grant Fritchey (2/15/2011)
Define a clustered index on the column that appears most frequently in the WHERE clause of SELECT
statements.


But not in JOIN criteria? No, no way. The basic rule for clusters should be the most frequently used access path to the data. This may be primary keys, this may be foreign keys, or it might be simply search criteria, but I wouldn't suggest limiting it to WHERE clauses.


Again, I'm not the "index Ninja" here, but I've also found that a table that suffers huge numbers of inserts can make very good use of a Clustered Index on an auto-numbering column such as an IDENTITY column or, perhaps, a date column to keep page splits to a reasonable level whether or not that column is the most frequently used in WHERE or JOIN criteria.

Have any of you good folks experienced the same or is there some other practice that folks use on a high insertion rate table?


I'm not an "index Niinja" either. I leave that to Gail, but in my experience and with my understanding, you're 100% right on here. Again, I like my own definition of "most frequently used access path" because that can apply to inserts as well as updates, deletes or selects. It doesn't matter. Since the cluster holds the data, you need to take advantage of that fact in whatever way is most advantageous to the system you're working on.

----------------------------------------------------
The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
Theodore Roosevelt

The Scary DBA
Author of: SQL Server Query Performance Tuning and SQL Server Execution Plans
Product Evangelist for Red Gate Software
Grant Fritchey
Grant Fritchey
SSC Guru
SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)SSC Guru (93K reputation)

Group: General Forum Members
Points: 93773 Visits: 33010
Jeff Moden (2/15/2011)
Gail and Grant (or anyone else that can jump in),

The article also has the following statements:

...whereas, with a nonclustered index the leaf level contains the address to the data pages contained in the search. Simply put, a clustered index is stored and sorted with the data in the clustered index key order. While the nonclustered index may be thought of as a pointer to the location of the data to which it refers.



I believe, from what you've shown me with DBCC IND, that the bolded statements in the above quote from the article are just not correct. To wit, if they were correct, even a covering index would cause "BookMark Lookups". Since I'm not an expert on indexes that you good folks are, can you confirm that the bolded statements in the above quote are, depending on whether it's a covering index or not, incorrect... or not? I realize that data is stored in the index but that data is actually used if it's a covering index.

Or... perhaps I'm just being a bit too critical of an introductory article.

{edit} And, yes, I realize that either the clustered index or an internal row number (for heaps) is used in an index which may be thought of as an "address" of sorts but the statements in the quote make it sound as if only address information is stored or that the lookup data stored cannot be used directly.

Thanks.


I think what he means there is that the rest of the data, that which is not a part of the nonclustered key or added by the INCLUDE statement, is stored at the clustered index and a pointer back to the cluster is stored at the leaf of the nonclustered index. I'm pretty sure that's what he meant, but the way it's written, I think it is subject to the interpretation you're giving it. He's basically right on this one, but it's phrased a bit awkwardly. At least that's my opinion on it.

----------------------------------------------------
The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
Theodore Roosevelt

The Scary DBA
Author of: SQL Server Query Performance Tuning and SQL Server Execution Plans
Product Evangelist for Red Gate Software
Kiara
Kiara
Right there with Babe
Right there with Babe (791 reputation)Right there with Babe (791 reputation)Right there with Babe (791 reputation)Right there with Babe (791 reputation)Right there with Babe (791 reputation)Right there with Babe (791 reputation)Right there with Babe (791 reputation)Right there with Babe (791 reputation)

Group: General Forum Members
Points: 791 Visits: 1426
Jeff Moden (2/15/2011)


Or... perhaps I'm just being a bit too critical of an introductory article.


I don't think you are.

I've seen plenty of "introduction to indexing" articles that are far easier to read and introduce the concepts behind indexes in a more approachable manner. If this was intended as "Index Theory 101", it seems to include some good basic information, but presented in a way that makes it seem far more complicated than it actually is - particularly when we read the caveats about what is out of scope and what is being deferred to other discussions.

If it's intended as "Practical Index Basics 101 for SQL Server", well, it appears to also miss the mark for its intended audience.

I guess that's why I'm so curious about the original reason this article was written and who the target audience was.

I'd rather refer someone new to the topic to Gail's series that starts here: http://www.sqlservercentral.com/articles/Indexing/68439/.

Ah, well. I've probably devoted too much time today to puzzling over this particular article.

-Ki
Charles Kincaid
Charles Kincaid
Hall of Fame
Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)

Group: General Forum Members
Points: 3167 Visits: 2384
First I think that the intent, target, and scope are great. Yes, there is room for revision.

Nice effort, Ron. Everybody gets the blowtorch to the backside for everything posted here. I remember having to suck it up and revise. Welcome to the team.

ATBCharles Kincaid
zebulonpi
zebulonpi
Grasshopper
Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)

Group: General Forum Members
Points: 11 Visits: 34
I think the criticism is less "taking a blowtorch to his backside" and more commenting about how, for an article that clearly states its intended audience is "the junior database administrator (DBA) or others with little or no experience in the design, administration, or optimization of database systems", some of the errors in the article may be taken as correct facts.

A lot of these articles are "fire and forget", in terms of people reading them, so someone new to SQL and indexing may well read this, say "OK, I understand now", and never come back to see any revisions, hence what I see as a concern.

Also, when publishing something intended as educational, AND being reviewed by your peers, you WILL get criticism. Take it as constructive, learn from it, and become better.
Charles Kincaid
Charles Kincaid
Hall of Fame
Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)

Group: General Forum Members
Points: 3167 Visits: 2384
zebulonpi (2/15/2011)
Also, when publishing something intended as educational, AND being reviewed by your peers, you WILL get criticism. Take it as constructive, learn from it, and become better.


Which was exactly my point. If there was no criticism then either it was not read or nobody cared as human effort is not perfect. Still the "blowtorch to the backside" was about the cleanest way that I could think of to state how this feels.

ATBCharles Kincaid
roger_os
roger_os
SSC-Enthusiastic
SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)SSC-Enthusiastic (191 reputation)

Group: General Forum Members
Points: 191 Visits: 568
I enjoy articles like this because I learn so much from the responses!
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search