SELECT TOP 1 on Primary Key?

  • lisalee

    Mr or Mrs. 500

    Points: 511

    I have a table (let's call it TableA) with a primary key defined (let's say the column is called ColumnPrimary). Would doing a SELECT TOP 1 on this table based on the primary key be any faster/more efficient than simply doing a SELECT on the table based on the primary key? For instance, SELECT TOP 1 * FROM TableA WHERE ColumnPrimary = 100 vs. SELECT * FROM TableA WHERE ColumnPrimary = 100? I could see how TOP 1 might be more efficient because once it finds one (in this case the only) match, it stops searching, since TOP 1 is satisfied; however, I'm thinking maybe the optimizer is smart enough to know that we're searching on the primary key and therefore it should automatically stop searching once the record is found. Any thoughts?

  • Chris Harshman

    SSC-Forever

    Points: 41850

    I haven't done any trials on this scenario, but my initial thoughts would be that TOP1 in this case would have ever so slightly negative or no effect on performance, especially if your primary key is a clustered index. The WHERE clause condition you specified would cause an index seek to happen, always finding either 0 or 1 row, even if the query was parameterized.

    If you look at an execution plan in Management Studio, you'll see that the TOP operation is usually the last step before the actual output of the SELECT statement, so it doesn't optimize the earlier steps knowing that it only needs 1 record.

  • lisalee

    Mr or Mrs. 500

    Points: 511

    Awesome reply! That helps me so much. Thank you so much for your help!

  • Matt Miller (4)

    SSC Guru

    Points: 124187

    If it's truly the primary key, then the TOP 1 is not going to help.

    I personally dislike (where dislike = is something I bounce back during code reviews) using TOP without an order by, for a couple of reasons:

    - if your TOP XXX call is trying to bring back LESS than what the filter would, then your query becomes "non-deterministic", meaning it may not return the same rows each time (since it can and will return ANY of the bigger pool to satisfy the TOP XXX). Sloppy, dangerous and usually causes a lot of questions from end-users.

    - if your TOP XXX call tries to bring back MORE than the filter has, then the TOP has the nasty side effect of forcing the ENTIRE result set of the query to be materialized (while it desperately looks for "enough" rows to match your TOP statement), instead of the typical TOP behavior (meaning - stop executing as soon as you have enough rows).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • AdrianLParker+sqlservercentral.com

    SSC Enthusiast

    Points: 125

    No, top in your case is not useful (and is likely detrimental processing-wise).

    TOP 1 is evaluated after the entire query is processed. It does not short circuit the scanning of a table. For example, let's say your query instead returned 50 records. It couldn't know which the top record was until the retrieved records were ordered. "TOP" is only useful when you want to return fewer records than the query would otherwise.

  • Jonathan AC Roberts

    SSCoach

    Points: 17010

    AdrianLParker+sqlservercentral.com - Friday, November 9, 2018 8:09 AM

    No, top in your case is not useful (and is likely detrimental processing-wise).

    TOP 1 is evaluated after the entire query is processed. It does not short circuit the scanning of a table. For example, let's say your query instead returned 50 records. It couldn't know which the top record was until the retrieved records were ordered. "TOP" is only useful when you want to return fewer records than the query would otherwise.

    I know this thread is 10 years old but anyway. There is a hint you can give SQL Server to return one row quickly: OPTION (FAST 1)
    But if you specify the value and it's a primary key (Unique) it should use the index. 
    Adding TOP(1) to a query can change the execution plan (as I have just tried) by changing the value in the TOP. In my test:
    select top(1) *
    from myTable
    where datestamp > '20170101'
    order by datestamp

    Has a different execution plan to 
    select top(100000000) * 
    from myTable
    where datestamp > '20170101'
    order by datestamp

  • AdrianLParker+sqlservercentral.com

    SSC Enthusiast

    Points: 125

    The MSDN says of OPTIONA (FAST X):

    Specifies that the query is optimized for fast retrieval of the first number_rows. This is a nonnegative integer. After the first number_rows are returned, the query continues execution and produces its full result set.

    ...so if it scans the entire table anyway, does it really do anything since you can't see the results until the database server returns them all?

  • Jonathan AC Roberts

    SSCoach

    Points: 17010

    AdrianLParker+sqlservercentral.com - Friday, November 9, 2018 10:12 AM

    The MSDN says of OPTIONA (FAST X):

    Specifies that the query is optimized for fast retrieval of the first number_rows. This is a nonnegative integer. After the first number_rows are returned, the query continues execution and produces its full result set.

    ...so if it scans the entire table anyway, does it really do anything since you can't see the results until the database server returns them all?

    re: does it really do anything since you can't see the results until the database server returns them all?
    You can see rows before the query finishes executing.

  • Lynn Pettis

    SSC Guru

    Points: 442180

    Considering that the queries shown in the original post both have the same where clause,WHERE ColumnPrimary = 100, and the ColumnPrimary is the primary key (unique and not null) using TOP (1) is extraneous and unneeded.

  • AdrianLParker+sqlservercentral.com

    SSC Enthusiast

    Points: 125

    @Lynn Pettis
    But I understand that the SQL engine doesn't short-circuit on primary keys. Example, once it finds the record with columnPrimary=100 it continues to scan the indices to find other matches. But, for same of argument, pretend the column isn't a primary key or even unique.

  • Lynn Pettis

    SSC Guru

    Points: 442180

    AdrianLParker+sqlservercentral.com - Friday, November 9, 2018 1:49 PM

    @Lynn Pettis
    But I understand that the SQL engine doesn't short-circuit on primary keys. Example, once it finds the record with columnPrimary=100 it continues to scan the indices to find other matches. But, for same of argument, pretend the column isn't a primary key or even unique.

    Why would it continue searching?  That defies what is meant by a PRIMARY KEY: unique (only 1) and not null.

  • AdrianLParker+sqlservercentral.com

    SSC Enthusiast

    Points: 125

    @Lynn Pettis
    A
    lthough it has a unique constraint, otherwise it's just stored as any other index is, is it not? Does the SELECT statement differentiate between a where clause against a unique field (be it primary key or other unique index) or a unique field?

    For example, does "where somePrimaryColumn = 3" short circuit when it finds the first matching record and "where somePrimaryColumn >= 3" does not? Is the engine that clever?

  • Lynn Pettis

    SSC Guru

    Points: 442180

    AdrianLParker+sqlservercentral.com - Friday, November 9, 2018 2:59 PM

    @Lynn Pettis
    A
    lthough it has a unique constraint, otherwise it's just stored as any other index is, is it not? Does the SELECT statement differentiate between a where clause against a unique field (be it primary key or other unique index) or a unique field?

    For example, does "where somePrimaryColumn = 3" short circuit when it finds the first matching record and "where somePrimaryColumn >= 3" does not? Is the engine that clever?

    There is ONLY one matching row of data when the where clause is WHERE somePrimaryColumn = 3.  The second where clause you propose is looking for 1 or more rows of data WHERE comePrimaryColumn >= 3.  They are not the same query.

  • AdrianLParker+sqlservercentral.com

    SSC Enthusiast

    Points: 125

    Lynn Pettis - Friday, November 9, 2018 3:14 PM

    AdrianLParker+sqlservercentral.com - Friday, November 9, 2018 2:59 PM

    @Lynn Pettis
    A
    lthough it has a unique constraint, otherwise it's just stored as any other index is, is it not? Does the SELECT statement differentiate between a where clause against a unique field (be it primary key or other unique index) or a unique field?

    For example, does "where somePrimaryColumn = 3" short circuit when it finds the first matching record and "where somePrimaryColumn >= 3" does not? Is the engine that clever?

    There is ONLY one matching row of data when the where clause is WHERE somePrimaryColumn = 3.  The second where clause you propose is looking for 1 or more rows of data WHERE comePrimaryColumn >= 3.  They are not the same query.

    I realize what those queries can return. Despite intuitively knowing that the first can only return a single row, does the SQL engine know that and short circuit on the query?

    Does the SQL engine treat the first query any differently than the second? Or, in the first does it still scan all indices to find matches (despite the fact that you and I know that it can only return one row)?

  • Lynn Pettis

    SSC Guru

    Points: 442180

    AdrianLParker+sqlservercentral.com - Friday, November 9, 2018 8:26 PM

    Lynn Pettis - Friday, November 9, 2018 3:14 PM

    AdrianLParker+sqlservercentral.com - Friday, November 9, 2018 2:59 PM

    @Lynn Pettis
    A
    lthough it has a unique constraint, otherwise it's just stored as any other index is, is it not? Does the SELECT statement differentiate between a where clause against a unique field (be it primary key or other unique index) or a unique field?

    For example, does "where somePrimaryColumn = 3" short circuit when it finds the first matching record and "where somePrimaryColumn >= 3" does not? Is the engine that clever?

    There is ONLY one matching row of data when the where clause is WHERE somePrimaryColumn = 3.  The second where clause you propose is looking for 1 or more rows of data WHERE comePrimaryColumn >= 3.  They are not the same query.

    I realize what those queries can return. Despite intuitively knowing that the first can only return a single row, does the SQL engine know that and short circuit on the query?

    Does the SQL engine treat the first query any differently than the second? Or, in the first does it still scan all indices to find matches (despite the fact that you and I know that it can only return one row)?

    So, the database engine finds the one and only row of data with the given primary key value but it is going to continue checking the other 100,000,000 rows of data just to be sure that there isn't another one sitting around some where.  Right.  Equality condition, unique value (ONLY ONE if it exists), it finds it or it doesn't.  It isn't going to keep searching once it has found the one value it was looking for.

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply