Speeding Up Database Access Part 3 - Fixing Missing Indexes

Question

Speeding Up Database Access Part 3 - Fixing Missing Indexes

mperdeck

SSC Enthusiast

Points: 144
More actions
August 1, 2011 at 9:39 pm

#245386

Comments posted to this topic are about the item Speeding Up Database Access Part 3 - Fixing Missing Indexes

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

Some corrections...

The article implies (thought doesn't outright say) that the rows in a table are physically sorted by the clustered index. This is not necessarily true, the order that the rows appear within pages and within the file can differ from the clustered index order. The clustered index enforces a logical sort order and SQL tries as much as possible to make the logical and physical sort order the same when the cluster is created or rebuilt, but it is not guaranteed.

If there is no clustered index, or if it is not unique, then non-clustered index records do have the physical address.

Only if the base table is a heap do the nonclustered index records get the RID (the row identifier). If the table has a clustered index then the nonclustered indexes get the clustering key. If the clustered index is not unique SQL adds a uiniquifier (a 4 byte int) which will also appear in the nonclustered index records.

Putting a Primary Key on a column has the effect of giving it a clustered index.

By default only. It's possible (and in some cases a good idea) to create a nonclustered primary key

When to use an index
Consider columns that have a UNIQUE constraint. Having an index on the column makes it easier for SQL Server to check whether a new value would not be unique.

No need, because unique constraints are enforced by unique indexes. Hence if you add an index to a column that has a unique constraint you're adding a duplicate index, wasting space and wasting performance.

Look at putting an index on at least one column involved in every JOIN. If you join two tables with 500 rows each, this potentially creates a set of 500 * 500 = 250000 rows - so an index on a JOIN can make a big difference.

It only creates such a large resultset if you're doing a cross join, and indexes won't help that. Indexes are recommended on foreign key columns, but not for the reason given here.

Low Specificity
Even if there is an index on a column, the query optimizer won't always use it.

If the index is covering for the query, SQL will use it even if there are only 2 values in a million rows.

The query optimizer is unlikely to use a non-clustered index for a price whose specificity is below 85%.

The figure's closer to 99%, assuming the index in question is not covering.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

brendan woulfe Hall of Fame Points: 3888 More actions · Answer 2

Gila answered one of my questions about the unique constraint/unique index already.

I do still have a question about the when to use clustered indexes section though. The sentence below makes sense to me.

Most often you'll make the primary key an IDENTITY column, so each new record is assigned a unique, ever increasing number. This means that if you put the clustered index on the primary key, new records are always added at the end of the table without page splits.

My question though is what if you have a table named Customers with a CustomerID primary key. CustomerID is is a uniqueidentifier. I have another table named PhoneNumber with a PhoneNumberID int identity and CustomerID is a FK that references Customers. I know I could create a clustered primary key on the PhoneNumberID integer and a nonclustered index on CustomerID. But, none of my queries/stored procedures reference the PhoneNumberID. They all do something similiar to "WHERE CustomerID = @CustomerID". Would it make more sense to make the PhoneNumberID field a nonclustered primary key to guarantee uniqueness in the table and put the clustered index on the CustomerID even though it's a uniqueidentifier?

Some small testing that I did seems to suggest that the latter seems to perform better than the former but I want to make sure I'm not missing anything. I'm also trying to figure out how it will perform once my application starts to scale out and we go from say a couple hundred customers to a couple hundred thousand customers and more.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

Personally I prefer clustered index to organise the table and nonclustered indexes for data access.

Shameless plugging of my own article... http://www.sqlservercentral.com/articles/Indexing/68563/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

brendan woulfe Hall of Fame Points: 3888 More actions · Answer 4

brendan woulfe

Hall of Fame

Points: 3888

August 2, 2011 at 9:55 am

#1362292

Thanks for the response. I will give it a read!!

🙂

SQLRNNR SSC Guru Points: 281334 More actions · Answer 5

Well, it looks like Gail already covered my concerns and then some in regards to this article.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

Yuri55 SSCrazy Eights Points: 8449 More actions · Answer 6

Gail,

Just to clarify- if physical order for clustered index in not always the same as logical order (and I read this before) why in this case page split happens while updating/inserting new records? Or am I missing something? Thanks,

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 7

The clustered index enforces which rows of the cluster belong together on the index pages. If a page has the values 1,3, 4 and 5 on it and value 2 is inserted, that value MUST go on that page. It doesn't have to go in any specific location on the page (it can go at the end after 5) but it must be on that page. If there isn't space on that page, the page must be split.

What is not enforced is the order of the index pages within the file (the page containing {10, 11, 12, 13} can appear earlier in the file than the page that contains {1,2,3,4,5} nor the order of rows on the page.

Yes, I need a blog post on this...

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Yuri55 SSCrazy Eights Points: 8449 More actions · Answer 8

Yuri55

SSCrazy Eights

Points: 8449

August 2, 2011 at 11:51 am

#1362419

Make sense.

Thanks a lot.

ffoerster Right there with Babe Points: 796 More actions · Answer 9

Hi, reading your article I have stumbled across a few more points:

Under "Selecting columns to give an index" you say

Putting a Primary Key on a column has the effect of giving it a clustered index.

A primary is not necessarily a clustered key by its nature.

1 line down:

Putting an index on a table column affects all queries that use that table.

Not necessarily. It is true though that modifications (DML INSERT, DELETE, UPDATE) will be impacted by the indexes on a table.

few lines further down:

A WHERE clause that applies a function to the column value can't use an index on that column

.

It can indeed, however, it will do a scan but no seek. Which means it has to go through all index values for that column to see if the result of the function satisfies the (WHERE) constraint. And it is not guaranteed that such index scan will be the fastest approach to execute the query.

Cheers, Frank

SanDroid SSChampion Points: 10068 More actions · Answer 10

GilaMonster (8/2/2011)
Personally I prefer clustered index to organise the table and nonclustered indexes for data access.
Shameless plugging of my own article... http://www.sqlservercentral.com/articles/Indexing/68563/

No shame in pointing out the truth when something less than that is offered.

Thank you Gail!