So I've been trying to troubleshoot the classic "sometimes the app is slow, so the SQL Server indexes must all be messed up" complaint that I often get to deal with.
I came across one of the dynamic ad-hoc queries, which took a while to run. It looks like this:
SELECT tab.* FROM tab WHERE tab.Row_Date Between '5/12/2014 3:00:00 PM' AND '5/12/2014 3:30:00 PM' ORDER BY tab.Row_Date
This took about 3.5 seconds to run (according to Profiler output when I was monitoring it yesterday). It took about 280ms to run this morning. The query runs every 15 minutes with a new time filter covering the last 15 minutes, so the literals would be 3:30 and 3:45 on the next run. The end users were working this morning, but I don't have an exact duplicate of their workload that I can replay, so maybe I could chalk this up to a difference in load.
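For what it's worth, timings can also be sampled from the plan cache instead of a Profiler trace. This is just a sketch; the LIKE pattern is illustrative and would need to match the actual query text, and since the ad-hoc query carries different literals each run, each variant tends to get its own cache entry with an execution_count of 1:

```sql
-- Rough timing check from the plan cache instead of a Profiler trace.
-- total_elapsed_time and last_elapsed_time are in microseconds.
SELECT TOP (10)
    qs.execution_count,
    qs.total_elapsed_time / qs.execution_count / 1000 AS avg_elapsed_ms,
    qs.last_elapsed_time / 1000                       AS last_elapsed_ms,
    st.text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE '%Row_Date Between%'   -- illustrative filter
ORDER BY qs.last_execution_time DESC;
```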
But I rewrote it to see if I could cut that 3.5 seconds down. This is what I came up with:
declare @startdate datetime
declare @enddate datetime
set @startdate=cast('5/12/2014 3:00:00 PM' as datetime)
set @enddate=cast('5/12/2014 3:30:00 PM' as datetime)
--SELECT tab.* FROM tab WHERE tab.Row_Date Between '5/12/2014 3:00:00 PM' AND '5/12/2014 3:30:00 PM' ORDER BY tab.Row_Date
SELECT tab.* /* the real query lists the columns explicitly */ FROM tab WHERE tab.Row_Date Between @startdate AND @enddate ORDER BY tab.Row_Date
The filter column is typed as datetime in the table, so I tried to eliminate the implicit conversion I expected SQL Server to perform in the filter by swapping the char literals out for datetime variables, and I expanded tab.* to an explicit column list.
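One way to check whether a cached plan actually contains an implicit conversion is to search the plan XML for CONVERT_IMPLICIT. A sketch (this scans the whole plan cache, so it's better run on a test box or off-hours):

```sql
-- Find cached plans touching Row_Date whose XML mentions an implicit
-- conversion. The text filters are illustrative.
SELECT cp.usecounts, st.text, qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE CAST(qp.query_plan AS nvarchar(max)) LIKE '%CONVERT_IMPLICIT%'
  AND st.text LIKE '%Row_Date%';
```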
I figured this should finish a little faster. Maybe not by much, but since it runs 24/7 on numerous servers, why not? It's a trivial rewrite, and even a small per-run gain would add up across all those servers, right?
Except it didn't. Even after a few executions it was rarely faster; on the few runs where it was faster, it was only by a few ms, and sometimes it was notably slower (like doubling the execution time).
Here are the execution times from the last few runs:
Q1 (literals): 224ms, 126ms, 124ms
Q2 (variables): 287ms, 238ms, 235ms
So I checked the plans for each, and I found that:
The first query executes by
index seek operation
bookmark lookup operation
and doesn't have a sort operation in its plan at all.
The second query executes by
table scan operation
which is (roughly) the plan I expected. Maybe I expected an index seek instead of a table scan, but I know the optimizer sometimes works out that a scan is cheaper and uses that operation instead, so fair enough.
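To compare the two plan shapes more concretely, both versions can be run with I/O and time statistics on and the logical reads compared in the Messages output. A sketch, reusing the same query text as above:

```sql
-- Compare logical reads and CPU for the two plan shapes.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Literal version (index seek + bookmark lookup plan):
SELECT tab.* FROM tab
WHERE tab.Row_Date BETWEEN '5/12/2014 3:00:00 PM' AND '5/12/2014 3:30:00 PM'
ORDER BY tab.Row_Date;

-- Variable version (table scan plan):
DECLARE @startdate datetime = '20140512 15:00:00',
        @enddate   datetime = '20140512 15:30:00';
SELECT tab.* FROM tab
WHERE tab.Row_Date BETWEEN @startdate AND @enddate
ORDER BY tab.Row_Date;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```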
The tab object above is a heap with 1.9 million rows and two nonclustered indexes. The first, nonunique, index includes only the Row_Date column; the second, also nonunique, includes all the other columns but NOT Row_Date.
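The index definitions can be sanity-checked from the catalog views. A sketch, assuming the table lives in the dbo schema:

```sql
-- List key and included columns for every index on the heap.
SELECT i.name         AS index_name,
       i.type_desc,
       i.is_unique,
       c.name         AS column_name,
       ic.key_ordinal,
       ic.is_included_column
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.tab')
ORDER BY i.index_id, ic.key_ordinal;
```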
All the statistics on the table were rebuilt last night with a 90% sampling ratio.
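In case it matters, the histogram SQL Server holds for Row_Date can be inspected directly. The statistics-object name below is an assumption (it often matches the index name); the commented query lists the real names:

```sql
-- List the actual statistics objects on the table first if unsure:
-- SELECT name FROM sys.stats WHERE object_id = OBJECT_ID(N'dbo.tab');

-- IX_tab_Row_Date is a hypothetical name for the Row_Date index's stats.
DBCC SHOW_STATISTICS (N'dbo.tab', N'IX_tab_Row_Date') WITH HISTOGRAM;
```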
I'm not worried overmuch about it, but I am still trying to explain how the second query is generally no better, and sometimes worse.
And I know there are a lot of variables involved, but does anyone see something obvious that I'm completely overlooking?