Ways to minimize sort operations

  • Comments posted to this topic are about the item Ways to minimize sort operations

    Regards
    Brian Ellul
    ----------------------------------------------------------------------------------
    Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
    - Albert Einstein -

  • I'm a little confused here. What is the intention of the article? To avoid unnecessary Sort operators in the execution plan or to eliminate ORDER BY and still get the results sorted?

    If the idea is to eliminate the need for ORDER BY then what this article describes should not be trusted. There is no guarantee that an index (clustered or non-clustered) will return the result in an ordered manner. IT might work today, but what happens when the amount of data or statistics changes? Will the QO find a better plan and havoc the ordered results that used to be returned to the client? It might very well do so.

    There is only one way to guarantee ordered results. Say it after me: ORDER BY.

    On the other hand, if you don't want SQL Server to waste memory, disk I/O and CPU on sorting, then you could send the result to the client and let it do the sorting. There is usually lots of unused CPU capacity on the client computers.

  • Like Nils, I dont quite understand the reason for the article, in 95% of cases the sort operation is the least of your worries in a plan, especially when you've joined 2 or 3 tables together to bring back an amalgamaged dataset, such as Stock levels by product, and unfilled orders.

    I'm not a fan of the Non-clustered Index with all other columns in an Include, one or two columns that are commonly associated with the business key, eg Indexed on ProductNumber, with the ProductName and ProductCode as include columns is reasonable.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • So the whole article is basically saying adding an index can assist in the performance of queries containing ORDER BY.

  • I dont see how people are saying this is dangerous, from looking at his code he does put Order by on each of the select statments so the data will always be ordered.

  • I agree with Steve - I'm not sure you have paid full attention to the SQL in the article. He is not suggesting removing order by, but creating indexes to optimise for the clause.

    Having said that, I can't remember ever creating an index to support an order by. As said, there are countless other factors far more important.

    I do have one issue with the article: there is no reason the IO should have been greater for the first query than the subsequent ones. I suspect that this was actually due to fragmentation of the original clustered index that of course was not present in the indexes created specifically for the tests. As such, this measure of improvement is false.

  • steveb. (9/15/2010)


    I dont see how people are saying this is dangerous, from looking at his code he does put Order by on each of the select statments so the data will always be ordered.

    Oops, my bad!! I apologize :crying:

  • This might be a little nit-picky, so I apologise in advance, but I understood the DBCC DROPCLEANBUFFERS command to only clear clean buffers from the buffer pool. If you don't use the CHECKPOINT command before calling DBCC DROPCLEANBUFFERS, you haven't forced dirty pages out of the page cache onto disk to maximise the effectiveness of DBCC DROPCLEANBUFFERS. (See http://msdn.microsoft.com/en-us/library/ms187762.aspx)

    If I'm right, this may also cause misleading performance results (or at least difficult-to-repeat performance results) between executions.

  • Very nice article - Thanks!

    I'm especially glad to see the effect of an update statement on the different methods discussed. And as you so properly point out, the proper answer is "it depends".

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Nils Gustav Stråbø (9/15/2010)


    I'm a little confused here. What is the intention of the article? To avoid unnecessary Sort operators in the execution plan or to eliminate ORDER BY and still get the results sorted?

    If the idea is to eliminate the need for ORDER BY then what this article describes should not be trusted. There is no guarantee that an index (clustered or non-clustered) will return the result in an ordered manner. IT might work today, but what happens when the amount of data or statistics changes? Will the QO find a better plan and havoc the ordered results that used to be returned to the client? It might very well do so.

    There is only one way to guarantee ordered results. Say it after me: ORDER BY.

    On the other hand, if you don't want SQL Server to waste memory, disk I/O and CPU on sorting, then you could send the result to the client and let it do the sorting. There is usually lots of unused CPU capacity on the client computers.

    Thanks for your feedback!

    The scope behind this article is to show how an already sorted result set will improve query performance since the data is already sorted. Let's take a DWH were having multiple indexes and/or Indexed views are not an issue since the data is preloaded during the ETL process. This method will improve query performance when running SQL or Reports on the source table.

    And yes, I'll say it after you.... the ORDER BY will always guarantee that the data set is ordered, in fact that's why I did also specify that in the SELECT statements. 🙂

    With regards to your last suggestion, yes that can be an option, however I'm trying to keep all processing centralized on the server, which is accessible by all. I would argue that if a data set is always required ordered by a specific column/s and multiple users need to access this data, why not sort it for everyone in the first place.

    Regards
    Brian Ellul
    ----------------------------------------------------------------------------------
    Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
    - Albert Einstein -

  • Brian, I'm have to apologize to you because for some reason my eyes and brain didn't see that you kept the ORDER BY clause. For some reason I thought that the article explained a way to get rid of the ORDER BY, but when I read it again I don't know where that idea came from :rolleyes:

    Again, I'm sorry. Its a good article !

  • Jason Lees-299789 (9/15/2010)


    Like Nils, I dont quite understand the reason for the article, in 95% of cases the sort operation is the least of your worries in a plan, especially when you've joined 2 or 3 tables together to bring back an amalgamaged dataset, such as Stock levels by product, and unfilled orders.

    I'm not a fan of the Non-clustered Index with all other columns in an Include, one or two columns that are commonly associated with the business key, eg Indexed on ProductNumber, with the ProductName and ProductCode as include columns is reasonable.

    Thanks for your feedback!

    I would put my worries when studying a plan where I have the biggest cost, which in this case I wanted it to be the sort operator, and that's why I didn't use a complicated SELECT statement. Yes, I agree, if the SORT operator is only a fraction of the total query cost I would not bother with improving it!

    Regards
    Brian Ellul
    ----------------------------------------------------------------------------------
    Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
    - Albert Einstein -

  • Nils Gustav Stråbø (9/15/2010)


    Brian, I'm have to apologize to you because for some reason my eyes and brain didn't see that you kept the ORDER BY clause. For some reason I thought that the article explained a way to get rid of the ORDER BY, but when I read it again I don't know where that idea came from :rolleyes:

    Again, I'm sorry. Its a good article !

    No problem and welcome 🙂

    Regards
    Brian Ellul
    ----------------------------------------------------------------------------------
    Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
    - Albert Einstein -

  • It's always refreshing to read clear, consise articles that get your grey matter spinning, even though they might not apply to you directly.

    Nice job Brian!

  • Thank you for taking the time and effort to submit:-) this article. I always appreciate reading about how to optimize queries and actually being able to understand what the writer is trying to convey. I also appreciate the other feedback as a reality check!

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply