SELECT TOP

  • Nice question. If there was an extra option "It Depends", i would have selected this option as the correct answer.

  • ronmoses (12/21/2010)


    For educational purposes, I would genuinely appreciate it if one of the folks who take issue with the "most of the time" factor could illustrate a scenario in which that script returns different results.

    I read through all the comments to be sure the "order by" was already.. noted. (as if there would be anything for me to add after the first 10 posts)

    I don't have time to make a reproducible illustration but wanted to throw in these two cents: With large data sets and parallelization you can get apparently unordered results even with the clustered index in effect due to the merge step simply combining multiple streams. Without an explicit order by clause, you get the unsorted merged streams. I discovered this trying to remove the cost of the order by under the erroneous assumption that the clustered index made the order by unnecessary. No, it's very necessary.

  • Mike Dougherty-384281 (12/21/2010)


    ronmoses (12/21/2010)


    For educational purposes, I would genuinely appreciate it if one of the folks who take issue with the "most of the time" factor could illustrate a scenario in which that script returns different results.

    I read through all the comments to be sure the "order by" was already.. noted. (as if there would be anything for me to add after the first 10 posts)

    I don't have time to make a reproducible illustration but wanted to throw in these two cents: With large data sets and parallelization you can get apparently unordered results even with the clustered index in effect due to the merge step simply combining multiple streams. Without an explicit order by clause, you get the unsorted merged streams. I discovered this trying to remove the cost of the order by under the erroneous assumption that the clustered index made the order by unnecessary. No, it's very necessary.

    So which was better performance wise? maxdop 1 or 0?

    Yes I know the results are wrong without the order by!

  • Good Question... I agree that not having an order by statement can cause the output to change. I also agree that the clustered index created in the script also had this affect.

    I also agree that this table schema would need a re-write for a production transactional database project. Thier are several articles in BOL and other places that would point that out.

    1. A table identity column should be part of the primary key for the table.

    2. If a table has a clustered index, it should include the primary key.

  • BOL (http://msdn.microsoft.com/en-us/library/ms189463.aspx) states clearly

    If the query has no ORDER BY clause, the order of the rows is arbitrary.

    hence putting "Ann" as the correct answer is simply wrong.

    Nevertheless, using TOP without an ORDER BY clause may definitely be useful: I like to use it to see just a sample row of a table not well known to me...;-)

    Best regards,
    Dietmar Weickert.

  • Always use the top with an ORDER BY clause!

    I got the question correct, but this is the last sentance of the answer's explanation.

    Why?

    Why should I always use TOP with an order by clause? What benefit do I get with say TOP 100% when I decide i want all the records to come back, just ordered?



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Dietmar Weickert (12/21/2010)


    BOL (http://msdn.microsoft.com/en-us/library/ms189463.aspx) states clearly

    If the query has no ORDER BY clause, the order of the rows is arbitrary.

    hence putting "Ann" as the correct answer is simply wrong.

    Nevertheless, using TOP without an ORDER BY clause may definitely be useful: I like to use it to see just a sample row of a table not well known to me...;-)

    It certainly seemed more arbitrary in the days of SQL 7 and SQL 2000, I remember trying to figure out what order records were coming back when I didn't use ORDER BY.

    These days I begin to wonder if the optimizer doesn't just return them in clustered index order when the ORDER BY clause is omitted (and a clustered index is present), and Microsoft lists in BOL that the order is arbitrary so that if they need to change the optimizer for some reason with a service pack, they can say "We told you it was arbitrary".



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Hi Steve,

    Can you please remove this question, because it is plain wrong:

    --Select the first customer

    SELECT TOP 1 * FROM #Customer

    Without ORDER BY, the order is not guaranteed.

    Thanks!

    AK

  • Just to be the devils advocate...

    1 - this is a local temp table. So only this connection is going to read from it which eliminates joining another active read on the data.

    2 - there's only 1 page of data, so there's no way you'll get parralelism.

    3 - The clustered index is clearly supplied in the question.

    4 - This is a trivial plan, so the most likely plan is a clustered scan.

    5 - As far as I know, the data is ordered IN the page itself (not 100% sure).

    Yes I know it still depends, but there's not much more else that could screw with the current "correct" answer :w00t:.

  • I believe the ORDER BY concept should also apply to the explanation because it would make more sense to say "always include an Order By when using Top" instead of "Always use the top with an ORDER BY clause". However, neither case is "always" true. Overall I get the point but as a question with three distinct answer choices and no real distinct answer it seems flawed at best.

  • Adding a clustered index is equivalent to adding a primary key and doesn't determine order. I don't agree with this answer.

  • donvon40 (12/21/2010)


    Adding a clustered index is equivalent to adding a primary key and doesn't determine order. I don't agree with this answer.

    Categorically untrue. A primary key is a unique logical key. SQLServer happens to implement it by creating a physical index. By default this index is clustered and unique (as the PK is logically unique) but the index does NOT need to be clustered. It can be created as a non-clustered, unique index if you so choose.

    A clustered index can be created as a unique index but it does NOT need to be unique. You can create a non-unique, clustered index. Data will be ordered based on the clustered index if it exists, although as pointed out throughout this post, the use of "top" does not guarantee the order any longer.

    Data will not be ordered based on the PK unless the PK is implemented as a clustered index.

    Frequently the PK is in fact a poor choice for the clustered index as clustered indexes are well suited to "range searches", and people tend to implement an identity column as the PK rather than to use the "business key", but that is another debate in itself. However like most situations, it depends on your design and the queries being run as to what is the best choice of clustered index.

    The original statement above is therefore completely inaccurate. A clustered index is NOT equivalent to a primary key.

    Regards

    Roddy

  • http://msdn.microsoft.com/en-us/library/ms175132.aspx

    We're all learning.:-)

  • The PHYSICAL order of data is only guaranteed to match the clustered index just after it is created or rebuilt. In this case the index is created and the data immediately read, no inserts or updates are made to the table, so in this instance it is a fair assumption that 'Ann' will be returned first (and the only sensible assumption from the choices available).

    Nice discussion on this from the QOTD of october 5th

    http://www.sqlservercentral.com/Forums/Topic998040-274-1.aspx

    ---------------------------------------------------------------------

  • pavanr (12/21/2010)


    I agree that without an ORDER BY clause, the returned result has no meaning.

    but can anyone explain, here in the case of Clustered Index, why it picks third record ?

    Does this depend on Order of Insertion ie., Identity values ?

    When you create a clustered index on a table, the data in the table is logically re-ordered to match the index key. A side effect of that is that a SELECT against the data is likely to return data ordered according to the clustered index regardless of an ORDER BY clause, but this isn't guaranteed.

Viewing 15 posts - 16 through 30 (of 50 total)

You must be logged in to reply to this topic. Login to reply