Is sort order important in index?

  • Often I will put a descending sort order on a clustered index, so that when I do a select I generally see the latest rows inserted first (depending of course on whats in the index.)

    Besides that, I have trouble understanding why the order by in the index could have any real significance i.e. I accept it might have the tiniest impact, but can't see when this would be important.

    It's a question I've posed myself on several occasions, but never (til now) got around to posting.

    Any views?

    Thanks,

    David McKinney.

  • Hi David,

    It depends on your queries. Let us say you always query from a table in specific order for example desc, then at the time of execution, the sort operation from the query plan is removed. This makes the query plan more efficient.

    -Roy

  • Roy nailed it. It generally has no impact until the query you're running requires SQL Server to reverse the order of the data. It can't do that for free.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I wouldn't have thought it would perform any sort operation i.e. it wouldn't re-sort your index just because it was ascending instead of descending. I assumed it would just 'start at the other end'.

    Am i being naive? 😀

  • David McKinney (2/5/2010)


    I wouldn't have thought it would perform any sort operation i.e. it wouldn't re-sort your index just because it was ascending instead of descending. I assumed it would just 'start at the other end'.

    Am i being naive? 😀

    I used to think exactly the same thing but I tested it once... there is a cost.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • David McKinney (2/5/2010)


    Often I will put a descending sort order on a clustered index, so that when I do a select I generally see the latest rows inserted first (depending of course on whats in the index.)

    Besides that, I have trouble understanding why the order by in the index could have any real significance i.e. I accept it might have the tiniest impact, but can't see when this would be important.

    It's a question I've posed myself on several occasions, but never (til now) got around to posting.

    Any views?

    Thanks,

    David McKinney.

    Without an ORDER BY, you're just getting lucky. If parallelism or any number of other things happen, there is no guarantee of order in a SELECT just because of a clustered index. Now, the really neat thing is, if you DO have your clustered index in descending order and you DO use ORDER BY with a DESC direction, you won't see a SORT in the execution plan but you will guarantee the order.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff,

    Yeah, I know that the sort order isn't guaranteed, even if the index is clustered. It's just pretty much always the case, and so can be handy. What prompted to me to start this practice, is that I found myseft invariably typing

    Select top 100 a,b FROM mytable

    ....F5.....

    ...doh....

    select top 100 a,b FROM mytable desc

    whereas now I can usually just type

    Select top 100 a,b FROM mytable

    ..but I never rely on a particular order unless I explicitly specify one.

    To justify this practice to myself I often rationalise that most queries are more likely to be interested in the latest data than the very first data.

    I almost feel that DESC should be the default!!! (My friend Zacharias agrees.)

  • So this has me confused, as it goes contrary to what I've previously read and experienced.

    As a quick test I ran a select * against a clustered table, ordered both with and against the index order.

    Both were executed by a simple clustered index scan. So what's the cost?

    Edit: looked a little deeper and compared the read statistics. The first time I executed it the only difference between the two was the number of read-aheads, so maybe the cost is that if you defragment the index it'll be optimized for forward reads, which probably isn't optimized for reverse reads?

    That's a pretty minor difference, compared to the effect of poor query plans.



    Dan Guzman - Not the MVP (7/22/2010)
    All questions have to be prefaced by Server version and 'according to MS Docs' or 'my own personal opinion based on how much detail I felt like digging into at the time.'

  • The cost will vary. It truly depends on the size of the table and the scan. Execute the query with the actual execution plan. Highlight your clustered index operation - look at the properties and there will be a cost there.

    The cost will change as the # of rows in the tables queried changes and as the complexity of the query changes.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • They have identical costs in the query plan.



    Dan Guzman - Not the MVP (7/22/2010)
    All questions have to be prefaced by Server version and 'according to MS Docs' or 'my own personal opinion based on how much detail I felt like digging into at the time.'

  • I think you are ignoring the cost of what happens when you insert the data.

    I think it would cause intense fragmentation of the index if you inserted a sequentially increasing PK into an table with a clustered PK index sorted in the reverse order. All inserts would happen on the first page, and would cause the first page to split every time it was filled.

    With a normal clustered index order, the inserts would happen on the last page, and it would just allocate a new page once it was filled without causing a page split.

    I haven't done any testing on this because it just sounds like a very bad idea. At the very least, you should test this to see how much index fragmentation you get before implementing it on a production system.

  • Take your query and step it up a few notches. Make it an extremely complex query and then compare the cost on the sort. Also, compare it when you have a table with several million records.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Michael Valentine Jones (2/9/2010)


    I think you are ignoring the cost of what happens when you insert the data.

    I think it would cause intense fragmentation of the index if you inserted a sequentially increasing PK into an table with a clustered PK index sorted in the reverse order. All inserts would happen on the first page, and would cause the first page to split every time it was filled.

    You're right - I hadn't considered it. And it certainly sounds a plausible hypothesis. Can anyone else confirm or deny this theory?

  • weitzera (2/9/2010)


    They have identical costs in the query plan.

    Yeah, they frequently do. You have to remember that costs in query plans are just estimates, not real numbers. They can be a somewhat useful measuring stick, but you have to remember that they're just guesses.

    The costs for reording the data can be trivial or great. As a matter of fact, in some instances, you'll see a different execution plan for ASC/DESC reversing the order.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Jeff Moden (2/5/2010)


    David McKinney (2/5/2010)


    Often I will put a descending sort order on a clustered index, so that when I do a select I generally see the latest rows inserted first (depending of course on whats in the index.)

    Besides that, I have trouble understanding why the order by in the index could have any real significance i.e. I accept it might have the tiniest impact, but can't see when this would be important.

    It's a question I've posed myself on several occasions, but never (til now) got around to posting.

    Any views?

    Thanks,

    David McKinney.

    Without an ORDER BY, you're just getting lucky. If parallelism or any number of other things happen, there is no guarantee of order in a SELECT just because of a clustered index. Now, the really neat thing is, if you DO have your clustered index in descending order and you DO use ORDER BY with a DESC direction, you won't see a SORT in the execution plan but you will guarantee the order.

    I've definitely seen that first hand on my dual core server. The results would be in two batches due to parallelism. That is, my results would be something like "B-D" and then I'd have "A, E-Z" in another batch. Thinking there was some kind of corruption, I would rebuild the clustered index and the issue would (seemingly) go away, only to return shortly thereafter.

    Once I put the ORDER BY in the query, I got the results as expected, and without a SORT operation (due to the ORDER BY being directly on the CI).

    [font="Tahoma"]Zycon - The Search Engine for Manufacturers[/url]. Helping Engineers and Technical Buyers Locate Aluminum Extrusions, Metal Stampers, Plastic Fabricators, Valves, and More.[/font]

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply