Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

The Voice of the DBA

Steve Jones is the editor of SQLServerCentral.com and visits a wide variety of data related topics in his daily editorial. Steve has spent years working as a DBA and general purpose Windows administrator, primarily working with SQL Server since it was ported from Sybase in 1990. You can follow Steve on Twitter at twitter.com/way0utwest

Use a Clustered Index

I stumbled across this clustered index test from Hugo Kornelis recently from a post in the forums. I think I’d read it before, but it was surprising to me again this week as I read through it.

The test seems to show that if you are changing data over time with UPDATEs on a table, which is a heap, then you can not only fragment the table, but you can dramatically impact performance. His test, which seems reasonable, shows that you can end up with 10 times the rows being read.

I’m not sure that this is representative and your mileage may vary. You could end up with more or less scans, but what if you had double the number of rows read? That’s still a substantial impact to performance in your application. You might not notice if it you don’t run queries that force table scans often, but based on the quality of code I see people asking about on the forums, that’s not likely.

Instead, add a clustered index. The original post was from someone worried about fragmentation on a series of tables for a third party application and they couldn’t “change anything”. To me that’s dumb. Adding a clustered index, in the hope of improving performance, isn’t really a big change. And it can easily be undone if it worsens performance in that application.

It’s not likely, but it could happen. However a third party that doesn’t want to allow a clustered index to be added to tables in their application because it might cause issues is a seriously ignorant company on how a clustered index works.

Comments

Posted by Jason Brimhall on 7 April 2010

I have an application very similar to that here.  We are not supposed to make any changes because future updates could delete the indexes or procs that we might add.

I don't like that kind of setup - but is too often something that we must deal with.

Posted by Seth Phelabaum on 7 April 2010

Right there with you Jason.  I'm in a similar environment, and while adding new stuff isn't really a problem, but changing anything existing can be overwritten.  Basing a bunch of stuff off of those base tables is also problematic, because they can change at any given time.

That said, we have a good relationship with the vendor and they'll push most of my changes into their source control, so it's not as bad as it could be.

Posted by Neil Laslett on 11 April 2010

We're in the same boat - our application vendor has provided a database that uses massive heap tables throughout.  I guess they figured a heap table was better than a clustered index on the GUID PK.  I could drop a clustered index on their GUID but I'm aware of the issues there.

So my question: What's the danger in adding a new integer identity column and making it the clustered index?  Worst-case scenario, it gets dropped in an upgrade and we're back to where we started.  Or are there worse potential consequences?  (Other than voiding our maintenance plan by hacking their core tables.)

Posted by david.r.buckingham on 12 April 2010

nlaslett,

The worst-case scenario is that their code can't handle the additional column and would cease to function.  Personally, I'd probably look for a suitable date column, or in lieu of that another column that range searches occur on (between, greater than, less than, etc.), and try adding a clustered index to that column.  It wouldn't change functionality, but could negatively impact performance so of course test it thoroughly before introducing it into production.

Posted by Steve Jones on 12 April 2010

I'd be careful about adding a new column. There are likely unqualified INSERT statements (no columns listed) or other SELECT * type issues. I would agree with David that searching for another column might be a better idea.

Leave a Comment

Please register or log in to leave a comment.