TOP vs Max/Min: Is there a difference?

Mike Byrd, 2020-08-14 (first published: 2018-12-17)

I’m always trying to learn more about how the black box optimizer works and recently have been wondering how it resolves the following queries:
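The four statements themselves are not reproduced in this text. As a minimal sketch only, assuming the clustered primary key PK_Entity is on a column named EntityID (a hypothetical name, not taken from the article), queries of the kind being compared might look like the following, with Queries 1 & 2 using MIN/MAX and Queries 3 & 4 using TOP. The exact plan shapes may depend on details of the original statements that are not shown here.

```sql
-- Sketch only: EntityID is an assumed column name for the clustered key.

-- Query 1: lowest key value via an aggregate
SELECT MIN(EntityID) FROM dbo.Entity;

-- Query 2: highest key value via an aggregate
SELECT MAX(EntityID) FROM dbo.Entity;

-- Query 3: lowest key value via TOP with an ascending sort
SELECT TOP (1) EntityID FROM dbo.Entity ORDER BY EntityID ASC;

-- Query 4: highest key value via TOP with a descending sort
SELECT TOP (1) EntityID FROM dbo.Entity ORDER BY EntityID DESC;
```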

The dbo.Entity table contains 31,404,767 rows. All 4 queries have the same IO properties (Table 'Entity'. Scan count 1, logical reads 4, physical reads 0, etc.), and the 4 query plans look like this:

Hmmm, interesting! I would have thought the optimizer might come up with the same query plan for each query, but the TOP queries end up with a slightly more complex plan. Let’s examine the 4 plans and see what the differences are.

All 4 seem to be equal according to the query cost (relative to the batch) of 25%, but looking at the actual numbers, the subtree cost for Queries 1 & 2 is not quite the same as the subtree cost for Queries 3 & 4.

This is not a big difference (0.0000043), but it is a difference. Looking closer at the two plan shapes, Queries 1 & 2 have the Stream Aggregate operator, while Queries 3 & 4 have the extra Compute Scalar, Constant Scan and Nested Loops operators.

I’m not going to discuss the relative merits of these operators, but one factor not shown here is timing. All 4 queries reported CPU times of 0 (really, less than 0.5 milliseconds), yet the elapsed times for Queries 3 & 4 were consistently higher, by 4-8, than those for Queries 1 & 2.
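IO and timing figures like these are typically captured with SET STATISTICS IO and SET STATISTICS TIME, which write per-statement reads and CPU/elapsed times to the Messages output. A minimal sketch follows; the measured statement is only a placeholder, and EntityID is an assumed column name rather than one taken from the article.

```sql
-- Turn on per-statement IO and timing messages for this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder statement to measure (EntityID is an assumed column name).
SELECT MAX(EntityID) FROM dbo.Entity;

-- Turn the messages back off when finished.
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

SET STATISTICS TIME reports whole milliseconds, which is why a sub-millisecond CPU time shows up as 0. Because elapsed time also varies with server load, it is worth running each statement several times before comparing.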

Queries 1 and 3 end up with a Forward scan of the Clustered Index (MIN) and Queries 2 and 4 with a Backward scan of the Clustered Index (MAX), each returning a single row, as shown in the diagram below:

This is the result (and the answer) I expected from all 4 queries, but I really do wonder why all the other operators are there in the query plan. I thought the optimizer was "smart" enough to recognize simple queries and not need the extra operators. 🙁

Finally, something I did not initially expect to see was the 4 logical reads. The output of the following command explains them:

dbcc showcontig ('dbo.Entity') with tableresults, all_indexes, all_levels;

The partial results are:

ObjectName	IndexName	IndexID	Level	Pages	Rows
Entity	PK_Entity	1	0	257776	31404767
Entity	PK_Entity	1	1	792	257776
Entity	PK_Entity	1	2	3	792
Entity	PK_Entity	1	3	1	3

The b-tree has 4 levels (0 through 3), so even though only one row is returned, each query still has to traverse all 4 levels, reading one page per level (i.e., 4 pages), to reach that row's data.
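DBCC SHOWCONTIG is deprecated in current SQL Server versions, and the documented replacement is sys.dm_db_index_physical_stats. As a hedged sketch (this is not what was run for this article), the same per-level breakdown could be obtained as shown below. The DETAILED mode is needed to see the non-leaf levels, and it reads every page of the index, so it can take a while on a table of this size.

```sql
-- Sketch: per-level page and row counts for the clustered index (index_id = 1).
-- Run in the database that contains dbo.Entity.
SELECT  i.name          AS IndexName,
        ps.index_id     AS IndexID,
        ps.index_level  AS [Level],      -- 0 = leaf level, highest value = root
        ps.page_count   AS Pages,
        ps.record_count AS [Rows],
        ps.index_depth  AS IndexDepth    -- number of levels in the b-tree
FROM    sys.dm_db_index_physical_stats(
            DB_ID(), OBJECT_ID(N'dbo.Entity'), 1, NULL, 'DETAILED') AS ps
JOIN    sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id
ORDER BY ps.index_level;
```

For the table described here, IndexDepth would be expected to match the 4 levels listed above, which in turn lines up with the 4 logical reads per query.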

So, what did we learn? It appears that the aggregate functions MAX and MIN and the TOP clause all start with the same Index scan, but how the optimizer then uses the intermediate results is slightly different. Is there a big performance hit using TOP? I would say no, but if these T-SQL statements were called 1000s of times over a short period of time, you might want to consider MAX and MIN in lieu of TOP.
