Query times out only in certain circumstances

  • I'm a developer at a small company that has lost its DBA. We have a serious production issue and all the developers, who pretty well know their way around SQL but not much more, are trying to figure it out.

    There is a stored procedure that runs a single query. That query is against a large table, and has a lot of parameters in the where clause and outer joins. On all test and development servers, the query returns quickly always. On the production server, when the query is run from inside SSMS, it returns quickly, no matter what the parameters are. On the production server, when called from the ASP.NET client, the query will return quickly when called for one customer (a particular parameter), and time out when called for another.

    I wonder if the cached query plan for one set of parameters to the extent that other parameters cause a time out? If so, why don't we see that effect when calling the procedure from SSMS?

    This query populates a main page in our product, so it is often-hit and critical.

    Here is the query, with some names modified:

    ALTER PROCEDURE [dbo].[GetIncomingOrders]

    @pCompany int,

    @pBusinessUnit int = NULL, -- BusinessUnitID

    @pStatus int = NULL,

    @pFormType int = NULL

    AS

    DECLARE @Now datetime = getdate()

    SELECT OrderId = o.AGInstanceId

    ,Region = bu.level200businessunitname

    ,Address1 = Address1.PartialValue

    ,Driver = d.Firstname + ' ' + d.Lastname

    ,CustomerName = coalesce(corrfname.PartialValue + ' ', fname.PartialValue + ' ', '') +

    coalesce(corrlname.PartialValue, lname.PartialValue, '')

    ,OrderStatus = CASE

    WHEN o.BeingWorkedByID IS NOT NULL

    THEN 'Order being worked by <b>' + substring(u.FirstName, 1, 1) + u.LastName + '</b>'

    WHEN ur_msgs.FormInstanceId IS NOT NULL

    THEN 'New Message'

    WHEN os.Name = 'Pending'

    THEN 'Pending <br /> <b>' + isnull(wos.OrderStatus, '') + '</b>'

    ELSE

    os.Name

    END

    ,TimeInQueue = CASE

    WHEN datediff(second, o.CreatedDate, @Now) > 86400 THEN

    convert(varchar(5), datediff(second, o.CreatedDate, @Now) / 86400) + ' day(s)'

    ELSE

    convert(varchar(5), datediff(second, o.CreatedDate, @Now) / 3600) + ':' +

    convert(varchar(5), datediff(second, o.CreatedDate, @Now) % 3600 / 60) + ':' +

    convert(varchar(5), datediff(second, o.CreatedDate, @Now) % 60)

    END

    ,SecondsInQueue = datediff(second, isnull(o.NextActionDate, o.CreatedDate), @Now)

    ,NextActionDate = isnull(o.NextActionDate, o.CreatedDate)

    ,HasNotes = CASE WHEN notes.FormInstanceID IS NULL THEN 0 ELSE 1 END

    ,HasMessages = CASE WHEN msgs.FormInstanceId IS NULL THEN 0 ELSE 1 END

    ,HasUnread = CASE WHEN ur_msgs.FormInstanceId IS NULL THEN 0 ELSE 1 END

    FROM dbo.[FormInstances] o (nolock)

    JOIN dbo.OverallOrderStatus os (nolock) ON os.ID = o.OrderStatusID

    JOIN dbo.Forms sf (nolock) ON sf.FormID = o.FormID

    JOIN dbo.Drivers d (nolock) ON d.AGDriverId = o.AGDriverId AND d.AGCompanyID = o.AGCompanyID

    JOIN dbo.BusinessUnitLookup bu (nolock) ON bu.BusinessUnitNumber = o.BusinessUnitNumber

    AND bu.AGCompanyId = @pCompany

    JOIN OrderStatuses WOS ON o.WorkflowOrderStatusId = WOS.OrderStatusID

    LEFT JOIN dbo.ActionSteps step (nolock) on step.ID = o.NextActionStep

    LEFT JOIN ( -- Address 1 designated?

    SELECT FormInstanceID, sfif.PartialValue

    FROM dbo.FormInstanceFields sfif (nolock)

    JOIN dbo.FormFields sff (nolock) ON sff.FormFieldID = sfif.FormFieldID

    JOIN dbo.FieldUseTypes t (nolock) ON t.FieldUseTypeID = sff.FieldUseTypeID

    WHERE t.FieldUseType = 'Address 1'

    ) Address1 ON Address1.FormInstanceID = o.FormInstanceID

    LEFT JOIN ( -- Customer First Name designated?

    SELECT FormInstanceID, sfif.PartialValue

    FROM dbo.FormInstanceFields sfif (nolock)

    JOIN dbo.FormFields sff (nolock) ON sff.FormFieldID = sfif.FormFieldID

    JOIN dbo.FieldUseTypes t (nolock) ON t.FieldUseTypeID = sff.FieldUseTypeID

    WHERE t.FieldUseType = 'Customer First Name' AND sfif.PartialValue <> ''

    ) fname ON fname.FormInstanceID = o.FormInstanceID

    LEFT JOIN ( -- Corrected First Name designated?

    SELECT FormInstanceID, sfif.PartialValue

    FROM dbo.FormInstanceFields sfif (nolock)

    JOIN dbo.FormFields sff (nolock) ON sff.FormFieldID = sfif.FormFieldID

    JOIN dbo.FieldUseTypes t (nolock) ON t.FieldUseTypeID = sff.FieldUseTypeID

    WHERE t.FieldUseType = 'Corrected First Name' AND sfif.PartialValue <> ''

    ) corrfname ON corrfname.FormInstanceID = o.FormInstanceID

    LEFT JOIN ( -- Customer Last Name designated?

    SELECT FormInstanceID, sfif.PartialValue

    FROM dbo.FormInstanceFields sfif (nolock)

    JOIN dbo.FormFields sff (nolock) ON sff.FormFieldID = sfif.FormFieldID

    JOIN dbo.FieldUseTypes t (nolock) ON t.FieldUseTypeID = sff.FieldUseTypeID

    WHERE t.FieldUseType = 'Customer Last Name' AND sfif.PartialValue <> ''

    ) lname ON lname.FormInstanceID = o.FormInstanceID

    LEFT JOIN ( -- Corrected Last Name designated?

    SELECT FormInstanceID, sfif.PartialValue

    FROM dbo.FormInstanceFields sfif (nolock)

    JOIN dbo.FormFields sff (nolock) ON sff.FormFieldID = sfif.FormFieldID

    JOIN dbo.FieldUseTypes t (nolock) ON t.FieldUseTypeID = sff.FieldUseTypeID

    WHERE t.FieldUseType = 'Corrected Last Name' AND sfif.PartialValue <> ''

    ) corrlname ON corrlname.FormInstanceID = o.FormInstanceID

    LEFT JOIN ( -- messages?

    SELECT DISTINCT FormInstanceId

    FROM dbo.FormInstanceMessages (nolock)

    ) msgs on msgs.FormInstanceId = o.FormInstanceID

    LEFT JOIN ( -- unread messages?

    SELECT DISTINCT FormInstanceId

    FROM dbo.FormInstanceMessages (nolock)

    WHERE MessageRead = 0

    ) ur_msgs on ur_msgs.FormInstanceId = o.FormInstanceID

    LEFT JOIN ( -- notes?

    SELECT DISTINCT FormInstanceID

    FROM dbo.FormInstanceNotes (nolock)

    ) notes ON notes.FormInstanceID = o.FormInstanceID

    LEFT JOIN dbo.Users u (nolock) ON u.ID = o.BeingWorkedByID

    LEFT JOIN ( -- Ignore duplicates in ServiceabilityCheckOrders.

    SELECT DISTINCT FormInstanceID

    FROM dbo.ServiceabilityCheckOrders (nolock)

    ) sc on SC.FormInstanceId = o.FormInstanceID

    WHERE (bu.BusinessUnitID = isnull(@pBusinessUnit, bu.BusinessUnitID)

    or bu.level300id = isnull(@pBusinessUnit, bu.level300id)

    or bu.level200id = isnull(@pBusinessUnit, bu.level200id)

    or bu.level100id = isnull(@pBusinessUnit, bu.level100id))

    AND sf.FormId = isnull(@pFormType, sf.FormId)

    AND o.AGCompanyId = @pCompany

    AND sf.ShowsInQueue = 'true'

    AND isnull(SC.FormInstanceId, 0) = 0 -- Exclude Serviceability Checks.

    -- Show only New or Pending orders in the queue UNLESS they have Unread Messages.

    AND (os.Name IN ('New', 'Pending') OR ur_msgs.FormInstanceId IS NOT NULL)

    AND o.OrderStatusId = isnull(@pStatus, o.orderStatusId) -- Status drop-down filters on New or Pending.

    ORDER BY o.CreatedDate DESC

    -- Each time the order queue refreshes it should also check for unassociated messages in the message queue.

    -- Check for unassociated messages for the given company.

    IF (EXISTS (SELECT *

    FROM dbo.[Messages] M (nolock)

    LEFT JOIN dbo.FormInstanceMessages sfim (nolock) on sfim.MessageId = m.MessageId

    WHERE m.NeedsOrderAssociation = 1

    AND M.AGCompanyId = @pCompany

    AND sfim.MessageID IS NULL

    )

    -- See if a new message has come in on a Serviceability Check order removed from the queue.

    OR EXISTS (SELECT *

    FROM dbo.ServiceabilityCheckOrders sco (nolock)

    JOIN dbo.FormInstances sfi (nolock) on sfi.FormInstanceId = sco.FormInstanceId

    JOIN dbo.FormInstanceMessages sfim (nolock) on sfim.FormInstanceId = sco.FormInstanceId

    AND sfim.MessageRead = 0

    WHERE sfi.AGCompanyId = @pCompany

    )

    )

    SELECT 1 as MessagesNeedAssociation

    ELSE

    SELECT 0 as MessagesNeedAssociation

  • Please read the second article in my signature block below regard getting help with performance problems (or something to that effect). It will show you what you need to post to get the best help possible. Beyond that, nothing I can do to help right now.

  • The profound ugliness of that query is somewhat of an indication of possible issues of underlying design problems which always lead to performance problems when the volume increases.

    Someone who knows what they are doing and who understands your application and the business requirements needs to be put to work.

    The probability of survival is inversely proportional to the angle of arrival.

  • Wow... That's a rather large query. A couple of notes. You have NOLOCK all over the place. Yes, that prevents locking, but it can lead to bad data being returned by the queries. Also, you have DISTINCT all over the place. This is an aggregation operation that is very costly.

    In general, I'd reassess what you're trying to do with this query.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Yeah, the query is awful. I've gone over it, and I can't improve it without changing the business requirements or getting more sanity in the (imported) data itself, both of which are out of my control.

    The DBA who quit had a thing for nolock. It's in every procedure he touched.

  • I haven't been able to find the slow parts of the procedure becuase it's never slow in SSMS, nor in any database except production. We can only reproduce the issue when it's run from the ASP.NET client. Restarting the database server fixes the problem for a few hours, then it resurfaces.

    With no way to reproduce the issue in SSMS or on any test database, we haven't been able to diagnose which parts of the query are slow. We'd have to deploy the altered query to production and remove functionality, which we're not allowed to do.

    The primary table has 220,000 rows. Supporting tables have row counts of 10 - 100.

    How slow? The query either returns immediately or times out (30 second time-out). Once it starts having the issue, one company id (parameter) will return quickly, and the other times out. Each time the fix is to restart the production database server.

    The two company ids have no rows in common in any of the tables referenced. One company accounts for about 90% of the data.

    I wonder if the stored execution plan is so skewed toward one company that querying the other one times out, but that doens't explain why it never times out in SSMS.

  • Sounds like a bad cached plan. Try adding the WITH RECOMPILE option to the store procedure.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply