Certain Value In Where Clause Hangs Query For 5 Minutes

  • I have waited a couple weeks to post this because it's going to sound ridiculous, and because I can't really provide the data here to recreate the problem due to it being sensitive. But I'm desperate and just have to ask...

    Basically, my company uses a program called IBM Cognos ICM, which is sort of a glorified front-end for SQL Server (2005 in our case). In this case it is being used to generate a web report which our employees use. The web report has dropdowns for the user to select a month and a costcenter. It works perfectly fine except for one costcenter. If the user selects that costcenter and ANY month, the report takes about 5 minutes to render its paltry 4 rows, compared with a couple seconds for any other costcenter. But it gets weirder.

    Not that the specifics probably matter, but the costcenter that caused problems 2 weeks ago was 21155. It exhibited this behavior for a couple days. Then was magically fixed for a couple days. Then came back. Then was fixed again. Today we discovered that a new costcenter, 00966, is exhibiting this behavior.

    I had our DBA run SQL Profiler while the report was hanging up so we could capture the SQL that's being run behind the scenes. So I've been able to recreate the problem using pure SQL in all 4 of our environments (different servers). FYI, the DBA also confirmed that all our databases have weekly maintenance performed to rebuild indexes, etc.

    I understand that none of you will be able to test this code for me at all, but I'll include it here on the off chance someone recognizes something familiar that has tripped you up in the past. I have cleaned this up as much as possible. The problem seems to lie in the WHERE clause. If I change the '00966' to any other value, it works fine. If I change the '2015, Month 03' value to any INVALID value (any value that would find no matches) it works fine. But the combination of '00966' and any valid month value hangs it up. Also, removing all the JOINS resolves the problem. So I've removed as many as I can but left the ones that are necessary to recreate the problem.

    SELECT *

    FROM

    (

    SELECT [_Port995].[Months],[_Port995].[CostCenter],[_Port995].[Tier],[_Port995].[MIPCategory],[_Port995].[Value_],[_I1].[ReportDisplay] AS [_Column1]

    FROM

    (SELECT [CostCenter],[Tier],[MIPCategory],[MIPSubcategory],[Months],[Value_] FROM [_Result964]) [_Port995]

    INNER JOIN [MIPSubcategoryConstants] [_I1] ON

    [_Port995].[MIPSubcategory] = [_I1].[MIPSubcategory]

    ) [_Port995]

    INNER JOIN (SELECT [TimeID_],[Name_],[Ending_],[Starting_] FROM [Time_]) [_E1] ON

    [_E1].[TimeID_] = 'CAL' AND

    [_E1].[Name_] = [_Port995].[Months]

    INNER JOIN (SELECT [TimeID_],[Level_],[Starting_],[Ending_],[Name_] FROM [Time_]) [_E2] ON

    [_E2].[TimeID_] = 'CAL' AND

    [_E2].[Level_] = 'Quarters' AND

    [_E2].[Starting_] <= [_E1].[Ending_] AND

    [_E2].[Ending_] >= [_E1].[Starting_]

    INNER JOIN (SELECT [TimeID_],[Level_],[Starting_],[Name_] FROM [Time_]) [_E5] ON

    [_E5].[TimeID_] = 'CAL' AND

    [_E5].[Level_] = 'Months' AND

    [_E5].[Starting_] = [_E2].[Starting_]

    INNER JOIN (SELECT [TimeID_],[Level_],[Ending_],[Name_] FROM [Time_]) [_E6] ON

    [_E6].[TimeID_] = 'CAL' AND

    [_E6].[Level_] = 'Months' AND

    [_E6].[Ending_] = [_E2].[Ending_]

    INNER JOIN (SELECT [Date],[PrevMonth] FROM [DateString]) [_E7] ON

    [_E7].[Date] = [_E2].[Ending_]

    INNER JOIN (SELECT [TimeID_],[Name_] FROM [Time_]) [_E8] ON

    [_E8].[TimeID_] = 'CAL' AND

    [_E8].[Name_] = [_E7].[PrevMonth]

    INNER JOIN

    (

    SELECT [CostCenter],[Months],[Tier],[Value_]

    FROM

    (

    SELECT [CostCenter],[Tier],[MIPCategory],[MIPSubcategory],[Months],[Value_]

    FROM [_Result188]

    ) [_E9]

    ) [_E9] ON

    [_E9].[CostCenter] = [_Port995].[CostCenter] AND

    [_E9].[Months] = [_E5].[Name_] AND

    [_E9].[Tier] = [_Port995].[Tier]

    INNER JOIN

    (

    SELECT [CostCenter],[Months],[Tier],[Value_]

    FROM

    (

    SELECT [CostCenter],[Tier],[MIPCategory],[MIPSubcategory],[Months],[Value_]

    FROM [_Result188]

    ) [_E10]

    ) [_E10] ON

    [_E10].[CostCenter] = [_Port995].[CostCenter] AND

    [_E10].[Months] = [_E8].[Name_] AND

    [_E10].[Tier] = [_Port995].[Tier]

    INNER JOIN

    (

    SELECT [CostCenter],[Months],[Tier],[Value_]

    FROM

    (

    SELECT [CostCenter],[Tier],[MIPCategory],[MIPSubcategory],[Months],[Value_]

    FROM [_Result188]

    ) [_E11]

    ) [_E11] ON

    [_E11].[CostCenter] = [_Port995].[CostCenter] AND

    [_E11].[Months] = [_E6].[Name_] AND

    [_E11].[Tier] = [_Port995].[Tier]

    WHERE

    [_Port995].[CostCenter] = '00966' AND

    [_Port995].[Months] = '2015, Month 03'

    (Sorry it's not very pretty here, it should paste nicely into SSMS.)


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • It is almost certainly going to do with parameter sniffing or poor indexing. With no details we can't do much to help.

    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks Sean, I was very reluctant to post here due to those obvious reasons. But I'm at the end of the road here as far as troubleshooting. Unfortunately indexing is out of our hands as the whole database is controlled by the IBM application.

    Per your link, I am attaching the actual and estimated execution plans here, which I can confirm are identical to the one used for other values. Just in case someone thinks of something.

    Thanks for your understanding.


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • Rowcounts in the actual and estimated plans are very different - do you know anything about statistics updates in the database?


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Thanks for catching that, Chris, I will have to ask our DBA as soon as I get into work today and do some Googling. I'm totally new to the whole DBA side of things, never having worked in a large organization before this job.


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • If you look at nodes 15 and 16 the actual and expected rowcounts are very different. The plan doesn't appear to be a good fit for this costcentre as it has many more rows in _Result188 than the optimiser is expecting - 6 million seeks is going to hurt. Once you've ruled out statistics, check out parameter sniffing as Sean recommended. Gail has a couple of great articles on parameter sniffing which are probably linked in the articles Sean posted.

    The query reads the same tables multiple times - you might want to look at refactoring it.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • I didn't post a link to Gail's articles but here it is. Please notice that there are links in that article to parts 2 & 3.

    http://sqlinthewild.co.za/index.php/2007/11/27/parameter-sniffing/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks again, Chris and Sean. I realize this was not the most appealing post to reply to, but your replies mean a lot to me.

    Our DBA updated all the statistics today and it had no effect on the slow-running query. I will re-check the post-statistics-update execution plan on Thursday morning when I'm back in the office. If there are still discrepancies between Estimated and Actual, AFTER updating statistics, does that indicate any other issue that we could attempt to address?

    I will be sure to thoroughly read through the link posted as well, thanks. But unfortunately refactoring (and proper indexing!) is out of the question here as the crappy SQL is generated by the IBM program. 🙁 You wouldn't even believe the original code; the code I posted above is only a fraction of the original after a LOT of work cleaning it up to remove anything that didn't contribute to the hanging problem. Blech.


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • Sometimes queries do benefit from covering indexes. Below are the covering indexes I would suggest in this case. I used the prefix CIX to denote that these are ment as covering indexes for a specifc query or set of queries. I cannot test this script because it requires the full table structure to be present. Maybe you can present them to your DBA and ask if he (or she) would give it a try. Simply remove the indexes if they do not improve performance so it is worth a try:

    CREATE NONCLUSTERED INDEX

    [CIX_Result188_CostCenter_Tier_Months]

    ON

    [_Result188]

    ( [CostCenter],

    [Tier],

    [Months] )

    INCLUDE

    ( [MIPCategory],

    [MIPSubcategory],

    [Value_] );

    CREATE NONCLUSTERED INDEX

    [CIX_Time_TimeID_Name]

    ON

    [Time_]

    ( [TimeID_],

    [Name] )

    INCLUDE

    ( [Starting_],

    [Ending_] );

    CREATE NONCLUSTERED INDEX

    [CIX_Time_TimeID_Level_Starting]

    ON

    [Time_]

    ( [TimeID_],

    [Level_],

    [Starting_] )

    INCLUDE

    ( [Ending_],

    [Name] );

  • over a year ago i had similar issues with sql 2005 and cognos.

    although updating statistics helped a bit the problem wasn't fully solved.

    it is very tempting to think "oh i have no influence on the generated query" but

    one experience that i wish to share is that you do have influence on the generated query by cognos but on a different level.

    after some crappy performance and very weird behaviour (and doing the standard things) we went to the cognos developer of the report and reviewed what the person did and designed. changing elements to a different datatype or selecting a different inputbox (with the same end results) could reduce the query speed by over 50%.

    (and also make the query look a lot more simple)

    Unfortunately i can not provide you technical details as i no longer work there ... but you do have influence on what cognos generates. and even if the report is designed outside of the company it can work wonders to contact those people

    (some problems with cognos were such that ibm sended people over from the uk to the netherlands to provide support even)

  • Thanks Eduard, I do understand what you're saying and I agree. But this issue is separate from Cognos being a pile of crap. The issue is reproducible in pure SQL against static tables, even with the Cognos SQL dramatically stripped down.

    I think the disparity between estimated and actual rows is a big red flag. The attached is the Actual Execution Plan AFTER updating all the statistics... I'll be pursuing this further today. 🙂


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • I imagine you've already checked this, but for the sake of being thorough, was the plan definitely recompiled (newer creation_time in sys.dm_exec_query_stats, or if run with OPTION (RECOMPILE) is the same plan generated)?

  • Thanks, as far as I know, OPTION (RECOMPILE) is only for stored procedures.


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • You're probably thinking of WITH RECOMPILE, which is used with CREATE PROCEDURE.

    OPTION (RECOMPILE) is a query hint.

    Cheers!

  • with those words it gives the feeling you have doubts so i found some more info:

    https://msdn.microsoft.com/en-us/library/ms181714(v=sql.90).aspx

    RECOMPILE

    Instructs the SQL Server 2005 Database Engine to discard the plan generated for the query after it executes, forcing the query optimizer to recompile a query plan the next time the same query is executed. Without specifying RECOMPILE, the Database Engine caches query plans and reuses them. When compiling query plans, the RECOMPILE query hint uses the current values of any local variables in the query and, if the query is inside a stored procedure, the current values passed to any parameters.

    RECOMPILE is a useful alternative to creating a stored procedure that uses the WITH RECOMPILE clause when only a subset of queries inside the stored procedure, instead of the whole stored procedure, must be recompiled. For more information, see Recompiling Stored Procedures. RECOMPILE is also useful when you create plan guides. For more information, see Optimizing Queries in Deployed Applications by Using Plan Guides.

    i checked it to see if it is valid for sql 2005 and to my surprise it is.

    i remember a way to query the query cache and remove the plan so i googled and found the following link:

    http://serverfault.com/questions/91285/how-do-i-remove-a-specific-bad-plan-from-the-sql-server-query-cache

    not sure if it is still usefull after the recompile option

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply