Never, Ever Use Clustered Indexes

  • Comments posted to this topic are about the item Never, Ever Use Clustered Indexes

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • [font="Verdana"]So.... Never, Ever Use clustered indexes ... while working with oracle 😀

    just kidding !!! :hehe:.

    [/font]

  • Fun post 🙂

    Very few system ends up chaining their database to another database. I believe in the agile way and to plan for a risk of changing db is something you should not do. That is a problem you take if it comes.

  • "every table within SQL Server should have a clustered index"

    Fine, but should it then also always be the (preferably meaningless?) primary key? Or an index that is more likely to be used to retrieve ranges of records?

  • The problem of possibly changing databases should be catered for in the actual application code by using a data access layer that can be changed according to the database used, not in the database.

    The "black box" data access layer will always take the same inputs and queries from the app and communicate appropriately with the relevant database.

    Visual Studio development makes this easy despite being a Microsoft product!

  • Alex-668179 (5/31/2012)


    "every table within SQL Server should have a clustered index"

    Fine, but should it then also always be the (preferably meaningless?) primary key? Or an index that is more likely to be used to retrieve ranges of records?

    Ah, but there's the rub. Honest people disagree. But there's little disagreement that there are radical differences in the methods of storage between a clustered index and a heap. In the overriding majority of the time, use a clustered index.

    Also, I wouldn't focus on that "range of records" argument either. I'm not sure where that came from originally, but it's long been considered bunk. I usually focus on making my clustered index the most frequent access path to the data in a given table. That's frequently the primary key, but not always.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • P Jones (5/31/2012)


    The problem of possibly changing databases should be catered for in the actual application code by using a data access layer that can be changed according to the database used, not in the database.

    The "black box" data access layer will always take the same inputs and queries from the app and communicate appropriately with the relevant database.

    Visual Studio development makes this easy despite being a Microsoft product!

    But if I'm understanding the point of the post while you can do some of this type of abstraction in a data access layer you still need to have the database optimised and as far as I know (which maybe isn't that much :-D) the only way to do this is at the database level and will therefore be different for different database products.

    Well that is unless you plan to load the entire DB into application memory and then I guess the data access layer could be responsible for all performance issues.

  • In fact, I’d argue that you need to design precisely for specific relational database management systems because, let’s face it, they don’t implement the fundamentals in the same way. If you mess up the fundamentals, you’ve just messed up your entire design.

    I love this article (Seriously. Well done, Grant.). It describes in yet another way how true portability is nothing more than a myth and a wish.

    I've always said that NOT using the extremely powerful proprietary extensions of either SQL Server or Oracle (to name just a couple of RDBMSs) is like not using anything but the 4 basic functions on a scientific calculator because some people won't have anything but a 4 function calculator. Then, even after making the mistake of using only those proverbial 4 functions, they find out they're moving from a scientific calculator to one that uses reverse notation and even the 4 functions don't work the same way.

    Oh, wait. No... Sorry. I forgot... we can use ORMs and things like Visual Studio where even a neophyte can make everthing portable without any hits on performance. Well, except for most batch code... and some front end code... Ok... most front end code especially when you have to trick the defaults into not using NVARCHAR for everything that's character based. Yep... gotta love things like Linq.

    At least we don't have to worry about how to write parameterized stored procedures to be portable. We're lucky that everyone got together and decided how to declare variables and what the datatypes will be, huh? Orms can take care of any shortcomings there, too, I'm sure.

    And lets talk about the true backbone of all companies... I'm certainly happy the we don't have a portability problem with differences in Reporting code and other BI code. We can just skip using what's built in and buy some 3rd party software that works on everything, right? Where it doesn't, an Excel spreadsheet will always do the job especially if we export all of our data as XML, right? What the heck... avoid the middle man.

    Let's really standardize and store all of our data in a nice simple XML only table. Then we won't even have portability problems with ETL or other communications. After all, it was built with true portability in mind. You don't even have to know what's in the files because it automatically tells you. You and your vendors don't have to talk about what the files will contain at all. It's like that wonderful EDI formatting but simpler.

    Ok... I have to go now. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I hear the argument "we don't want to be locked in" from the app dev folks - the same ones that aren't afraid to be locked in to their programming language and/or OS. It's a "logic free" argument.

    --Paul Hunter

  • I almost flipped the bozo bit before I got to the fourth paragraph.

    Now that I've read the whole article, I'm really not clear what your message is.

    FWIW,

    -Chris C.

  • Paul Hunter (5/31/2012)


    I hear the argument "we don't want to be locked in" from the app dev folks - the same ones that aren't afraid to be locked in to their programming language and/or OS. It's a "logic free" argument.

    Ah, but the programming language and the OS they understand. The DB is just this black box that spits out the right answers (and at the same time is the source of their problems.)

  • Unless you are willing to dedicate the resources, performance in cross platform data applications will not be optimized. Which is fine if it's a trival app, but can become a quagmire on an a product that requires performance.

  • Chris.C-977504 (5/31/2012)


    Now that I've read the whole article, I'm really not clear what your message is.

    In fact, I’d argue that you need to design precisely for specific relational database management systems because, let’s face it, they don’t implement the fundamentals in the same way. If you mess up the fundamentals, you’ve just messed up your entire design.

    The point is that you can't take a database that's designed for Oracle, change the t-sql to be Microsoft SQL (or MySQL, etc) compliant and expect it to work well. It'll function. It'll just be slow because the core engines are optimized for different DB design.

  • The older version of our app is one of those that supports either a Oracle or SQL Server back end. This is not the first time where I've either read or head someone say it is impossible to have any application that does this efficiently. This isn't the case here. From what I have seen it appears to come down to the development platform when it comes to our app. In the 90's our app was written in a language called CET BASIC. You may not be familiar with it but it was/is a great full featured dev platform that we carried over from THEOS OS when we went to Windows.

    Performance was very good for both Oracle or a SQL server back end. Then we went to a VB based dev platform. The SQL Server back end was still very fast while the Oracle back end lagged in ways that I sometimes found unacceptable. This speed issue had nothing to do with indexes or any other tuning issue. It seemed to me it was related to the data access tools used to get the data out of Oracle. I could verify that the Oracle server was practically asleep waiting for request from the app. I'm not a VB programmer but I always kind of thought Microsoft intentionally wanted access to any DB besides SQL Server to be slower. I have no proof of that specific claim but I have results from testing.

    Cheers

  • Probably being a bit ignorant here but once you're writing T-SQL you're already specializing for MS SQL Server and Sybase ASE.

    So not using vendor-specific stuff (e.g. clustered indexes) after that - that has no impact on the way the application is programmed except for performance - seems ridiculous.

    I mean, even the data types differ per vendor so you'll have to specialize already. Right?

Viewing 15 posts - 1 through 15 (of 64 total)

You must be logged in to reply to this topic. Login to reply