query change gives weird perf results

  • So I've been trying to troubleshoot the "sometimes the app is slow, so sql server indexes, all of 'em are messed up" that I often get to deal with.

    I came across one of the dynamic ad-hoc queries, that took awhile to run. It looks like this:

    Q1:

    SELECT tab.* FROM tab WHERE tab.Row_Date Between '5/12/2014 3:00:00 PM' AND '5/12/2014 3:30:00 PM' ORDER BY tab.Row_Date

    This took about 3.5 seconds to run(according to profiler output when I was monitoring it yesterday. It took about 280ms to run this morning. The query runs every 15 minutes with a new time filter(for the last 15 minutes, so the literals would be 3:30 and 3:45 the next run.) The end users were working this morning, but I suppose I don't have an exact duplicate of their work that I can replay so maybe I could chalk this up to a difference in workload.

    But I rewrote it to see if I could cut that 3.5 seconds down. This is what I came up with:

    Q2:

    declare @startdate datetime

    declare @enddate datetime

    set @startdate=cast('5/12/2014 3:00:00 PM' as datetime)

    set @enddate=cast('5/12/2014 3:30:00 PM' as datetime)

    --SELECT tab.* FROM tab WHERE tab.Row_Date Between '5/12/2014 3:00:00 PM' AND '5/12/2014 3:30:00 PM' ORDER BY tab.Row_Date

    SELECT tab.col1,

    tab.col2,

    ...

    FROM tab

    WHERE tab.Row_Date Between @startdate AND @enddate ORDER BY tab.Row_Date

    The filter column is typed as datetime in the table, so I tried to eliminate the implicit conversion I expected sql server to perform in the filter by swapping the char literals out with datetime vars, expanded the tab.* to an explicit column list.

    I figured this should finish a little faster. Maybe not by much, but since it runs 24/7 on numerous servers I figured why not, it's a trivial rewrite, and that might be offset by overall efficiency improvements across all servers, right?

    Except it didn't. Even after a few executions, it was rarely faster, on the few runs it was faster it was only by a few ms, and sometimes it was notably slower (like doubling the execution time) to complete.

    Here are the execution times from the last few runs-

    Q1: 224ms, 126ms, 124ms

    Q2: 287ms, 238ms, 235ms

    So I checked the plans for each, and I found that:

    The first query executes by

    index seek operation

    ->

    bookmark lookup operation

    ->

    select operation

    and doesn't have a sort operation in it's plan at all.

    The second query executes by

    table scan operation

    ->

    sort operation

    ->

    parallelism(gather streams)

    ->

    select operation

    which is (roughly) the plan I expected (maybe I expected an index seek instead of a table scan, but I know that sometimes the optimizer works out that a scan is cheaper and uses that operation instead, so fair enough.)

    The tab object above is a heap with 1.9 million rows and has two nonclustered indexes. The first nonunique index includes only the row_date column, the second nonunique index includes all other columns, but does NOT include the row_date column.

    All the statistics for the table were rebuilt with a 90% sampling ratio last night.

    I'm not worried overmuch about it, but I am still trying to explain how the second query is generally no better, and sometimes worse.

    And I know there a lot of variables involved, but does anyone see something obvious that I'm completely overlooking?

  • I'm surprised to see parallelism which could be part of the issue.

    Is optimize for ad hoc workloads enabled on the server? This changes how SQL Server handles plan caching in a situation where you get a lot of ad hoc queries like your first example.

    What are the estimated vs. actual row counts in each query plan (actual)?

  • I've tried restricting Q2 to single threaded execution, and that didn't help, it just eliminated the parallelism from the plan and extended the duration time a trivial amount.

    I haven't checked ad-hoc optimization but I will.

    The row est. vs row act. is also funky, but not the Saturday Night Fever way.

    In the case of Q1, the est. rows is something like 3, but the actual is 95.

    In the case of Q2, the est. rows is in the 100K range, but the actual is 95.

    I had assumed that the statistics update job that ran last night would have resulted in the est. vs act. rows being pretty much the same.

  • Well, as far as I can tell, the config option to optimize for ad-hoc workloads wasn't available in Sql Server 2000. But it's been awhile since I'd had to work with Sql 2K, so I may have missed it or missed it in the documentation.

  • tresiqus (5/13/2014)


    Well, as far as I can tell, the config option to optimize for ad-hoc workloads wasn't available in Sql Server 2000. But it's been awhile since I'd had to work with Sql 2K, so I may have missed it or missed it in the documentation.

    Didn't realize you were on SQL 2k, I just assume 2008+ now. I'm not sure optimize for ad hoc workloads was available in 2000 either and I don't have a 2000 box to look at. You could do:

    EXEC sp_configure 'show advanced option', '1';

    GO

    RECONFIGURE;

    GO

    Exec sp_configure

    GO

    EXEC sp_configure 'show advanced option', '0';

    GO

    RECONFIGURE;

    and look to see if it is returned.

    The difference in your row estimates explains the difference in query plans. 100K vs 3 definitely changes the way the optimizer will think about the query. 3 rows definitely would be a seek, lookup, and nested loop joins while 100K would be scan and probably a merge or hash join on the first join.

    Can you post the execution plans?

  • Yeah, it's Sql Server 2000. It doesn't do anything for me, but supposedly they're going to let me migrate the 2k to 2012 later this year, so once that's done it will be nice. There was a big improvement (imho) in the management tooling for Sql Server starting with the 2005 release.

    I'm mostly trying to work out how the rows est. vs actual is so dramatically off after stats were updated with a 90% sample.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply