RE: How to optimize correctly 20 indexes ?

SSC Guru

Points: 1003861

April 10, 2014 at 9:54 am

Working from the bottom up on your response...

Understood on the "Divide by one" thing. Parameterising that and using "1" for that particlular situation makes your code a lot easier that possibly having multiple decisions as to which formula to use if the value of "1" is used infrequently and you don't mind burning the occasional unnecessary clock cycles. Well done there! 🙂

On the recommendations...

1) drop the ID column ?

"It Depends". I'm right there with you on using an IDENTITY column to make an easy unique reference to any given row even if there's an alternate unique key based on natural values especially when such an AK is a composite. If you decide to keep it (and I probably would based on what I just said), I wouldn't waste the clustered index on this mostly static table.

2) set the PK (clustered index) on YPID, CellX, CellY and CellZ ?

Just to be clear, the PK doesn't have to be the clustered index. That's just the default for when you don't specify and the clustered index hasn't been created yet.

Also, yes, the clustered index should be moved to other columns but maybe not in the order that you have them. If you look at parameter {1} and {3} and {9} in the WHERE clause, they're column names and I would put the order of those columns in the clustered index from left to right in the most common order of selectivity with the most selective being on the left and the least selective being on the right.

According to your original WHERE clause that you posted...

WHERE CellX BETWEEN 1 AND 140 AND CellZ >= -2 AND CellZ < 14 AND YPID = 'PYpa_6'

... the CellY column didn't even enter the mix. I only included CellY in my recommendation because I think it will be necessary to make each row in the index unique. So far as I'm concerned, if a clustered index isn't unique and because all of the other indexes contain the columns of the clustered index behind the scenes, having a non-unique clustered index can cause a world of hurt insofar as performance and disk storage goes.

Based on the original WHERE clause above, my recommendation for the clustered index would be CellX, CellZ, and YPID. If that's not a unique combination, then add CellY to the end of that. If that still doesn't make it unique, then post back and I'll tell you how to make it unique.

Shifting gears to your first question, your PvQlt columns are what make the table denormalized. They can have NULL values according to your table definition and you can't search for a quality value unless you also know the quality column that the value might be in. That also means that if you have to do searches for quality values, you might have to index all 20 of those columns to get the required performance out of your columns. While it looks like it's more efficient storage wise as you have the table now, remember that indexes are basically a sorted duplication of data and that the clustered index will also be added to each non-clustered index. Having an index on each PvQlt column would possibly consume much more than if each PvQlt value were on a single row even if all the other non-PvQlt values needed to be repeated. Of course, that would also be a denormalization but it would make searches for PvQlt values a lot easier and faster. To bring this table to 3rd normal form, you'd have to split this table up into other tables quite a bit and that might be serious overkill but, again depending on what your searches will be, normalizing the PvQlt columns in a separate table using the IDENTITY column as the join value might be very worth while. It might help performance a bit, as well, because the base table would be able to fit more rows per page in the clustered index (which is actually the table data at the leaf level).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)