Query plan - largest cost

  • This query takes about 45 seconds to complete (down from 10+ minutes by adding clustered index to table). Looking at the query plan, the largest cost is a clustered index seek -- which, as I understand it, is what I'm supposed to be getting rather than table or index scans.

    What other indexes may be missing to speed this up?

    I know it's a big topic, where should I start reading for more information on how to read & understand a query plan and tune a query.

    DBCC FREEPROCCACHE;

    DBCC DROPCLEANBUFFERS;

    GO

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT DISTINCT vwea.FranchiseNumber,

    vwea.CustomerID,

    vwea.EmailBefore,

    vwea.EmailAfter,

    vwea.ChangeDate,

    j.ScheduledEndDate AS MoveDate,

    CASE

    WHEN j.ScheduledEndDate < vwea.ChangeDate THEN 'After Move Change'

    ELSE 'Before move change'

    END AS Change

    FROM dbo.tblJob j

    INNER JOIN dbo.tblWorkOrder wo ON j.DatabaseInfoID = wo.DatabaseInfoID

    AND j.WorkOrderID = wo.WorkOrderID

    INNER JOIN dbo.EmailAudit vwea ON wo.CustomerID = vwea.CustomerID

    INNER JOIN dbo.tblDatabaseInfo ON j.DatabaseInfoID = dbo.tblDatabaseInfo.DatabaseInfoID

    INNER JOIN dbo.tblFranchise ON j.DatabaseInfoID = dbo.tblFranchise.DatabaseInfoID

    AND vwea.FranchiseNumber = dbo.tblFranchise.FranchiseNumber

    WHERE (vwea.FranchiseNumber = 253)

    AND j.ScheduledEndDate < vwea.ChangeDate

    ORDER BY vwea.ChangeDate DESC;

    SET STATISTICS IO OFF;

    GO

    SET STATISTICS TIME OFF;

    GO

    Thanks,

    Rob

    (Query plan attached)

  • Looking at this, the first thing I'd suggest is you update your statistics. They're pretty badly out of wack. That might be why you're getting such an expensive plan. You're processing huge amounts of rows to only return 144. Also, I only see five tables the query, but, quick count, eight access points in the plan. Are you dealing with views or nested views here? That could also be problematic.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I'm updating statistics on those tables right now; dbo.EmailAudit is a view. I'll look into that.

    Thanks,

    Rob

  • The simplification step in the optimization process can clean up issues with views, but not always and not always all the way. Depending on what's there, you might be better off accessing those tables more directly through this query.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I did an update of the statistics on all of the related tables. That didn't change the execution time.

    The view ([font="Courier New"]EmailAudit[/font]) duplicates many of the tables in the original query. Just running the view takes about 25 seconds.

    The view definition is:

    DBCC FREEPROCCACHE;

    DBCC DROPCLEANBUFFERS;

    GO

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    --CREATE VIEW [dbo].[EmailAudit]

    --AS

    SELECT COALESCE(f.FranchiseNumber, di.FranchiseNumber) AS FranchiseNumber,

    a.[CustomerID],

    a.[EmailAddress] AS [EmailBefore],

    c.EmailAddress AS [EmailAfter],

    a.[ChangeDate]

    FROM [dbo].[tblEmailAudit] a

    INNER JOIN dbo.tblCustomer c ON C.DatabaseInfoID = a.DatabaseInfoID

    AND c.CustomerID = a.CustomerID

    AND ISNULL(c.FranUniqueID, c.DatabaseInfoID) = ISNULL(a.FranUniqueID, a.DatabaseInfoID)

    LEFT JOIN dbo.tblFranchise f ON a.DatabaseInfoID = f.DatabaseInfoID

    AND ISNULL(f.FranUniqueID, f.DatabaseInfoID) = ISNULL(a.FranUniqueID, a.DatabaseInfoID)

    LEFT JOIN dbo.tblDatabaseInfo di ON a.DatabaseInfoID = di.DatabaseInfoID

    WHERE ISNULL(a.EmailAddress,'') <> ISNULL(c.EmailAddress,'');

    SET STATISTICS IO OFF;

    GO

    SET STATISTICS TIME OFF;

    GO

    So I rewrote the original query removing the view and I'm still getting about the same time (40-45 seconds).

    -- original query rewritten to remove EmailAudit view --

    DBCC FREEPROCCACHE;

    DBCC DROPCLEANBUFFERS;

    GO

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT DISTINCT COALESCE(f.FranchiseNumber, di.FranchiseNumber) AS FranchiseNumber,

    a.CustomerID,

    a.EmailAddress AS EmailBefore,

    c.EmailAddress AS EmailAfter,

    a.ChangeDate,

    j.ScheduledEndDate AS MoveDate,

    CASE

    WHEN j.ScheduledEndDate < a.ChangeDate THEN 'After Move Change'

    ELSE 'Before move change'

    END AS Change

    FROM dbo.tblEmailAudit a

    INNER JOIN dbo.tblCustomer c ON c.CustomerID = a.CustomerID

    AND c.DatabaseInfoID = a.DatabaseInfoID

    INNER JOIN dbo.tblWorkOrder wo ON c.CustomerId = wo.CustomerId

    AND wo.DatabaseInfoId = a.DatabaseInfoId

    INNER JOIN dbo.tblJob j ON j.WorkOrderID = wo.WorkOrderID

    AND j.DatabaseInfoID = a.DatabaseInfoID

    INNER JOIN dbo.tblDatabaseInfo di ON di.DatabaseInfoID = a.DatabaseInfoID

    LEFT JOIN dbo.tblFranchise f ON a.DatabaseInfoID = f.DatabaseInfoID

    AND ISNULL(f.FranUniqueID, f.DatabaseInfoID) = ISNULL(a.FranUniqueID, a.DatabaseInfoID)

    WHERE (COALESCE(f.FranchiseNumber, di.FranchiseNumber) = 253)

    AND j.ScheduledEndDate < a.ChangeDate

    AND ISNULL(a.EmailAddress,'') <> ISNULL(c.EmailAddress,'')

    ORDER BY a.ChangeDate DESC;

    SET STATISTICS IO OFF;

    GO

    SET STATISTICS TIME OFF;

    GO

    The two biggest costs in the query plan are both index seeks (one clustered and the other non-clustered). The third largest cost is an index scan on dbo.tblJob.

  • rgtft (9/16/2012)


    The two biggest costs in the query plan are both index seeks (one clustered and the other non-clustered). The third largest cost is an index scan on dbo.tblJob.

    There's really only one way for the seeks to be taking longer than a scan... the seeks are being repeated. Look at the properties of the seeks and see that they are executed many times. A better index capable of doing an initial seek and a "range scan" following that would be a tremendous amount faster.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Silly question, how do I read the execution plan to know what columns to index?

    I created two indexes (just eyeballed query and guessed). The first one seems to have improved the query, but the second one (though being used) still doesn't seem to improve anything.

    -- Two indexes created --

    CREATE NONCLUSTERED INDEX [IX_tblCustomer_CustomerId_EmailAddress] ON [dbo].[tblCustomer]

    (

    [CustomerID] ASC,

    [EmailAddress] ASC

    );

    CREATE NONCLUSTERED INDEX [IX_tboWorkOrder_CustomerID_DatabaseInfoId] ON [dbo].[tblWorkOrder]

    (

    [CustomerID] ASC,

    [DatabaseInfoId] ASC

    );

    Thanks,

    Rob

  • I see a couple of websites that explain some of this:

    I also see that Grant has a book on execution plans:


    Anyways, I've now got query down to 15-20 seconds; it looks like I've got some reading to do.

    Rob

  • ISNULL and COALESCE on columns in the where clause are going to hurt performance. I still think your stats are off. I'm seeing estimated 4.5 million, actual 15,000. That's a pretty big disparity and is usually indicative of out of date or badly sampled statistics.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (9/17/2012)


    ISNULL and COALESCE on columns in the where clause are going to hurt performance....

    This should be close to a SARGable version of the query:

    SELECT DISTINCT COALESCE(f.FranchiseNumber, di.FranchiseNumber) AS FranchiseNumber,

    a.CustomerID,

    a.EmailAddress AS EmailBefore,

    c.EmailAddress AS EmailAfter,

    a.ChangeDate,

    j.ScheduledEndDate AS MoveDate,

    CASE

    WHEN j.ScheduledEndDate < a.ChangeDate THEN 'After Move Change'

    ELSE 'Before move change'

    END AS Change

    FROM dbo.tblEmailAudit a

    INNER JOIN dbo.tblCustomer c

    ON c.CustomerID = a.CustomerID

    AND c.DatabaseInfoID = a.DatabaseInfoID

    and (

    c.EmailAddress <> a.EmailAddress

    OR (c.EmailAddress IS NULL AND a.EmailAddress IS NOT NULL)

    OR (c.EmailAddress IS NOT NULL AND a.EmailAddress IS NULL)

    )

    INNER JOIN dbo.tblWorkOrder wo

    ON c.CustomerId = wo.CustomerId

    AND wo.DatabaseInfoId = a.DatabaseInfoId

    INNER JOIN dbo.tblJob j

    ON j.WorkOrderID = wo.WorkOrderID

    AND j.DatabaseInfoID = a.DatabaseInfoID

    LEFT JOIN dbo.tblDatabaseInfo di

    ON di.DatabaseInfoID = a.DatabaseInfoID

    AND di.FranchiseNumber = 253

    LEFT JOIN dbo.tblFranchise f

    ON (f.FranUniqueID = a.FranUniqueID OR f.DatabaseInfoID = a.DatabaseInfoID)

    --a.DatabaseInfoID = f.DatabaseInfoID

    AND f.FranchiseNumber = 253

    --AND ISNULL(f.FranUniqueID, f.DatabaseInfoID) = ISNULL(a.FranUniqueID, a.DatabaseInfoID)

    WHERE 1 = 1

    --(COALESCE(f.FranchiseNumber, di.FranchiseNumber) = 253)

    AND j.ScheduledEndDate < a.ChangeDate

    --AND ISNULL(a.EmailAddress,'') <> ISNULL(c.EmailAddress,'')

    ORDER BY a.ChangeDate DESC;

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Grant Fritchey (9/17/2012)


    ISNULL and COALESCE on columns in the where clause are going to hurt performance. I still think your stats are off. I'm seeing estimated 4.5 million, actual 15,000. That's a pretty big disparity and is usually indicative of out of date or badly sampled statistics.

    I manually ran (below) on all of the associated tables:

    UPDATE STATISTICS dbo.tblCustomers WITH FULLSCAN;

    As I understand it, without specifying an index or statistic, it should update all of them on the table. Or am I off here?

    Thanks,

    Rob

  • ChrisM@Work (9/17/2012)


    Grant Fritchey (9/17/2012)


    ISNULL and COALESCE on columns in the where clause are going to hurt performance....

    This should be close to a SARGable version of the query:

    Chris M:

    Your code is SARGable and faster (aprox. 9 seconds); the only issue is it's bringing back a few thousand records rather than the 144 I expected. I'll look into your code as to why.

    Thanks,

    Rob

  • rgtft (9/17/2012)


    ChrisM@Work (9/17/2012)


    Grant Fritchey (9/17/2012)


    ISNULL and COALESCE on columns in the where clause are going to hurt performance....

    This should be close to a SARGable version of the query:

    Chris M:

    Your code is SARGable and faster (aprox. 9 seconds); the only issue is it's bringing back a few thousand records rather than the 144 I expected. I'll look into your code as to why.

    Thanks,

    Rob

    Hi Rob

    The join on dbo.tblFranchise is complex - I've set it up as a left join for safety. The code that this was cloned from - your last code iteration - showed it as a left join, however it was referenced in the WHERE clause, which would usually flag it up as an inner join. Try it with an inner join - but I suspect it will be a little more complex than that. Returning all the values from dbo.tblFranchise in the result set will help you decide how to filter.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply