QOD 11/21

  • Solution IMHO is wrong becouse we loose unique constraint on customer ID.

    What do you think about it?

  • Loosing unique constraint by doing that is right, but the question was:

    quote:


    Which of the following could be done to improve performance?


    It did not say anything about other impacts of the change...

    Edited by - hanslindgren on 11/21/2003 03:13:00 AM

  • Oh dear. Where do you start with this one. Well let's try this: To create a primary key clustered index on the LAST_ORDER_DT column would require each value in the LAST_ORDER_DT column to be unique and I would suggest that this is not going to be likely. Or is there something about primary keys that I misunderstand?

  • The "correct answer" of course improves performance but raises many problems :

    1. uniqueness of customer_id is lost

    2. last_order_dt should not be unique

    3. the clustered index would be updated on each update of last_order_dt, which will decrease overall performance.

  • A tad unfair I feel. As DBAs we are expected to foresee consequences of actions and just dropping a clustered index to create it on another, more volatile field, would definitle raise an eyebrow or two from my coleagues. A very unlikely real-world solution, especially as there was a more likely real-world (not Cuckoo land) solution in the form of the non-clustered index.

  • I agree with most of the voters at this time... the non-clustered index option is not only a viable solution, but is in fact probably preferable to changing the existing clustered index (even assuming you'd ensure a PRIMARY KEY constraint / non-clustered index would be left on the CustomerID column).

    This is because ADDING an index will almost certainly not reduce performance of any existing queries, whereas CHANGING the clustered index for a table is a decision not lightly entered into.

    PS: Changing the clustered index would be the preferable solution if the question had stated that there were no other queries against the table (but that does beg the question of how the data gets in there in the first place...)

    --

    Si Chan

    Database Administrator

  • Only a clustered index will solve this problem. You can only have one of these so you must drop the existing cluster.

    Hands up all thise who didn't spot the change in the answers between the email and the web site? The answer was changed from dropping the primary key to just dropping the cluster!

  • Index questions always cause such interesting debates. The only way to correct his range query with the order by was to move the clustered index (since there can only be one) to the column that's being used in the ORDER BY clause. A non-clustered index on that column will cause SQL Server to have to do an index lookup for each record that's greater than the desired date, then do a sort on that after it retrives the collection of records.

    Of course, much research must go into making an index decision, which should be assumed. That's one of the reasons I hoped to remove some of that controversy by saying that most of the queries run against the system looked like that query. At that point, you're fixing the largest system problem, even though you may create some minor performance problems with other queries.

    Regardless, unless you're periodically ordering on the primary key, which most don't in an identity type environment, the clustered index there doesn't do you a whole lot of good. It's best served as a non-clustered index for foreign key lookups since it has a high degree of selectivity.

    Fun discussion though and always gets to a religious debate. I learn something from it everytime.

    Brian Knight

    bknight@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/bknight

  • Dammit, once again! So creating a non-clustered index on the date column is wrong... sorry, I can see no advantage in creating the clustered index on the date column instead of leaving it where it is. In my eyes, creating a non-clustered index will give anyone enough performance to say the answer is correct (yes we DID improve the performance, didn't we???) - You don't make the questions harder (and the ones feel better who chose right) by giving one 100% and one 99,99999% solution, it's just going to be ridiculous.

  • Get real!

    You never (say again NEVER) want to create the clustered index on a volatile column. Every time the customer had a new order, SQL Server would have to move the record within the clustered index. Plus all non-clustered indexes would have to be updated for this change as well.

    Your system administrator would kick you a-- for burning up his hard drives. Your users would hang you for the lousy performance. After this, your boss would fire you. All this would happen before lunch.

    Larry Ansley

    Atlanta, GA


    Larry Ansley
    Atlanta, GA

  • Very interesting. I'm glad I'm not the only one that raised an eyebrow at the "correct answer." So no preformance gain would be had with the non-clustered index?

  • You are correct that you wouldn't normally want to have clustering on a volatile column. However, this doesn't seem to be that volatile and there are no other indices to consider!

    Having a non clustered index on the date column will NOT HELP with this query. SQL probably wouldn't even use the index if you created it!

  • BanzaiSi:

    quote:


    This is because ADDING an index will almost certainly not reduce performance of any existing queries


    Have you ever heard of queries that does INSERTs? Will they always be faster with an additional index? Add a hundred indicis, add a record and be happy with your not-degraded performance 😉

    lansley: Why do you assume that the column LAST_ORDER_DT is volatile? You never (say again NEVER) assume anything 😉 What if it never changes? How do you know?? Based on the name of the column? Does that give you enough information to say what you are saying? You have no idea what the column is intended for and you should not try to guess.

    Thanx for the Religous debate 😉 I am of course agnostic to any hard-core fanatics out there...

    Regards, Hans!

  • Why in the world would you put a clustered index on a field that is likely to change EVERY time a new order is placed?

    Adding the non-clustered index will solve the performance problem, and not add new performance problems.

    Student of SQL and Golf, Master of Neither


    Student of SQL and Golf, Master of Neither

  • quote:


    lansley: Why do you assume that the column LAST_ORDER_DT is volatile? You never (say again NEVER) assume anything 😉 What if it never changes? How do you know?? Based on the name of the column? Does that give you enough information to say what you are saying? You have no idea what the column is intended for and you should not try to guess.


    What is it about LAST_ORDER_DT that you don't understand? If it doesn't change, the company doesn't have any revenue and goes bankrupt. In many companies, the same customers order every day.

    quote:


    SQL probably wouldn't even use the index if you created it!


    Actually, SQL Server uses the clustered index EVERY time, even when it also uses a non-clustered index. The clustered index is where the data is. But I know what you meant. It might scan the clustered index, instead of performing a seek.

    In most customer tables, there are customer numbers and customer names. The customer number (almost certainly unique) or an identity column are the best candidates for the clustered index. Even a Heap with only non-clustered indexes is a better solution than clustering a volatile date column.

    Larry Ansley

    Atlanta, GA


    Larry Ansley
    Atlanta, GA

Viewing 15 posts - 1 through 15 (of 54 total)

You must be logged in to reply to this topic. Login to reply