Blog Post

TOP 1 killed my performance; Rob Farley helped me resurrect it.

,

tl;dr; There is no short version of this. It’s LONG. It’s interesting but you might want to skip it if you are in a hurry. If you have the time however, it’s a great walk through of tuning a query.

I had a great time today. I was handed a fairly random query by a co-worker/friend and asked if I could help speed it up some. Now after looking at it a bit I decided it was .. odd. Leaving aside any coding issues (it used an old style join) it was SLOW (not done after 12 minutes). But only when it had a TOP 1. If I removed the TOP 1 it ran in about 2 seconds. I realize that certain phrases (TOP, FAST) in a query change the Row Goal (the number rows the optimizer expects to get back or get back quickly) and that can change the query plan, even dramatically, but I had no clue why the big slow down. I decided to ask the experts and put a question into dba.stackexchange. I got a couple of great answers (You should go read them). Certainly better than my initial attempt. Which was to dump the ~5k rows into a temp table and get the TOP 1 from the temp table.

A little while later Rob Farley (b/t) contacted me and started asking a few questions about it. It turned into him walking me through tuning the query. It was an awesome experience and with his permission I decided to share. I took screen shots of the conversation and I’ll go ahead and make comments after each image with my impressions and any information I think would be helpful.

Let’s start by setting up a test environment. Adam Machanic (b/t) gave me a great starting point and I’ve modified it a bit to make it easier to follow along.

USE tempdb
GO
CREATE TABLE Big_Table (Sort_Id int NOT NULL, 
Join_Id int, Other_Id int NOT NULL)
ALTER TABLE Big_Table ADD CONSTRAINT id 
PRIMARY KEY CLUSTERED (Sort_Id, Other_Id)
GO
WITH a AS (SELECT 1 AS i UNION ALL SELECT 1),
b AS (SELECT a.i FROM a, a AS x),
c AS (SELECT b.i FROM b, b AS x),
d AS (SELECT c.i FROM c, c AS x),
e AS (SELECT d.i FROM d, d AS x),
f AS (SELECT e.i FROM e, e AS x)
INSERT Big_Table
SELECT TOP(1500000)
ROW_NUMBER() OVER (ORDER BY (SELECT null)),
ROW_NUMBER() OVER (ORDER BY (SELECT null)) + 
(CHECKSUM(NEWID()) % 100),
1
FROM f
GO
-- Index left over from me flailing around
CREATE INDEX ix_Big_Table ON Big_Table (Sort_Id, Join_Id)
CREATE TABLE Small_Table (PK_Id int NOT NULL IDENTITY(1,1) 
CONSTRAINT pk_Small_Table PRIMARY KEY, 
Join_Id int, Other_Col int, Other_Col2 char(1), 
Where_Col datetime)
GO
INSERT Small_Table
SELECT TOP(5000)
Join_Id,
1, 'a',
DATEADD(MINUTE, -Join_Id, GetDate())
FROM Big_Table
ORDER BY Join_Id DESC
GO

Note: If you want to follow along you’ll need to set your cost threshold for parallelism to at least 30 (not sure what the lowest limit is but 30 is what I have) to avoid some parallel plans that you’ll get if it’s at the default 5.

EXEC sp_configure 'cost threshold for parallelism', 30;
RECONFIGURE;

Next here is the initial query (cleaned up a little bit and modified to suit the demo).

SELECT TOP 1 ST.Join_Id,
        ST.Other_Col,
        ST.Other_Col2,
        ST.PK_Id,
        BT.Sort_Id
FROM Small_Table ST
JOIN Big_Table BT
    ON ST.Join_Id = BT.Join_Id
WHERE ST.Where_Col < = GETDATE()
ORDER BY BT.Sort_Id;
GO

Here is the estimated query plan (remember I never waited for it to finish)
RobPerformanceQP1

And when I remove the TOP 1 I got this query plan and it ran in 2 seconds.
RobPerformanceQP2

Last but not least here is the initial answer I went with (from the DBA.se question above). I changed the INNER JOIN to an INNER HASH JOIN and it ran in just under a second. I generally try to avoid hints but in this case it seemed acceptable.

SELECT TOP 1 ST.Join_Id,
        ST.Other_Col,
        ST.Other_Col2,
        ST.PK_Id,
        BT.Sort_Id
FROM Small_Table ST
INNER HASH JOIN Big_Table BT
    ON ST.Join_Id = BT.Join_Id
WHERE ST.Where_Col <= GETDATE()
ORDER BY BT.Sort_Id;
GO

And here is the query plan.
RobPerformanceQP3

So now we are all at the same starting point. Here is where Rob comes in. I’ve made some obvious changes here and there to what he said so the column and table names match the demo code. Also there was a bit extra in the WHERE clause that I eliminated because it didn’t really have an effect on the outcome.

RobPerformance01_2

Here Rob suggests an index that I admit I should have thought of myself. I decided the table was so small it didn’t really matter. Silly mistake. While Rob is “talking” I’m trying it out. Note: The index he suggested was a filtered index (it had a WHERE clause). Since I’ve removed that condition from the query I removed it from the index. You’ll see a few references to it at the beginning.

RobPerformance02_2

Just in case it’s not obvious, when Rob talks about dive in and try one row what he means is that it’s going to pull the first row of Big_Table (based on the Sort_Id) and then look to see if there is a matching value in Small_Table. The optimizer feels this would be faster than building a HASH TABLE across the entire Small_Table even though it’s fairly small.

RobPerformance03

Here is the new index I created.

CREATE INDEX ix_Small_Table ON Small_Table(Join_Id, Where_Col) 
     INCLUDE (Other_Col, Other_Col2, PK_Id)

And here is the new query plan. And you get to actually see it. Rob had to put up with my description.
RobPerformanceQP4

Better, but still a little slower than the initial fixed version that used the INNER HASH JOIN.

RobPerformance05_2
RobPerformance06

Here we are looking at the difference between the estimated and actual number of rows for an element of the plan. To look at this information you can either mouse over the element or right click and open the properties tab. In this case you will see that the estimated number of rows (what the optimizer thought would happen) is fairly low (117) particularly compared to what actually happened (1494900). When you see a big difference like that in a query plan there is something wrong.
RobPerformanceQP5_2

RobPerformance07

Here we are discussing how the current plan works. Top 1 makes the optimizer think that it will get the answer quickly. So it loops through the Big_Table (it being the table that’s ordered) and then looks into Small_Table to see if there is a match. And as Rob says if that look into the Small_Table is a scan across 5000 rows it could take some time.

RobPerformance08

RobPerformance09

Here Rob starts describing a way to look at how the optimizer works. You can pretend it’s just paper and start using logic. In this particular case looking through a phone book, to say find someone on a particular street. Rob also suggests adding 1 to the sort order. (I’m particularly impressed at how closely he predicts what the plan is going to look like.)

New query:

SELECT TOP 1 ST.Join_Id,
        ST.Other_Col,
        ST.Other_Col2,
        ST.PK_Id,
        BT.Sort_Id
FROM Small_Table ST
JOIN Big_Table BT
    ON ST.Join_Id = BT.Join_Id
WHERE ST.Where_Col <= GETDATE()
ORDER BY BT.Sort_Id + 1;
GO

And here is where things went sideways in my demo.

Here is the plan I ended up with.

RobPerformanceQP7

And here is the plan I expected.

RobPerformanceQP6

If you look you’ll see that in the plan I got Big_Table is still on top and I get the long running query. In the one I expected (and got at work) Small_Table is on top and the run is sub second. So what’s going on? After a number of frustrating hours and some more help from Rob I finally tracked down the problem. In my demo the Sort_Id is an integer, at work it was a character. And just so you know the column was named “Number” so I think it’s pretty reasonable I was confused. On a side note this is a great example of why the table structures matter when asking a question on a forum.

Ok, so why did the data type mater so much in this case? Well the idea here is to break the sort without actually breaking it. In other words we want the sort column to no longer be SARGable (i.e. ignore the fact that we have an index) but not actually change the sort order (we don’t want to affect our output). Adding a 1 to a character column forces an implicit data type conversion (and as it happens we can no longer guarantee the same order of the data) but SQL is smart enough to know that adding a constant integer to an integer column doesn’t really affect anything.

So obviously adding a +1 isn’t exactly the right answer. So what is? Well in the case of an integer I added RAND(4) (no particular reason for the 4, any int will do). The RAND() function returns a value between 0 and 1, but most importantly, per BOL, Repetitive calls of RAND() with the same seed value return the same results. So the actual row order doesn’t change. But because RAND() is non-deterministic SQL treats the ORDER BY as non-SARGable.

For a character string it’s even easier. Just add an empty string (”). I’m not entirely sure why it causes the sort to be non-SARGable but it does.

New New query:

SELECT TOP 1 ST.Join_Id,
        ST.Other_Col,
        ST.Other_Col2,
        ST.PK_Id,
        BT.Sort_Id
FROM Small_Table ST
JOIN Big_Table BT
    ON ST.Join_Id = BT.Join_Id
WHERE ST.Where_Col <= GETDATE()
ORDER BY BT.Sort_Id + RAND(4);
GO

Query plan is the expected one above.

RobPerformance10
RobPerformance11

We are doing an index scan on Big_Table which isn’t great since it’s a 1.5 million row table. A seek would be much better but that would require me using the correct index!

-- Drop incorrect index
DROP INDEX Big_Table.ix_Big_Table;
-- Put index back but in the right direction 
-- (join, sort rather than sort, join)
CREATE INDEX ix_Big_Table ON Big_Table (Join_Id, Sort_Id);

RobPerformance12

New Query Plan

RobPerformanceQP8

With a time of around 30ms! Big change from even the seven seconds produced by my first fix.

The rest is just review. I’ve left it here for you to read but I won’t be making any more comments.

RobPerformance13
RobPerformance14
RobPerformance15
RobPerformance16
RobPerformance17

Filed under: Microsoft SQL Server, Performance, SQLServerPedia Syndication, T-SQL Tagged: microsoft sql server, Performance, T-SQL

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating