OPTION (MERGE JOIN) "fixes" slow-running queries in SQL Server 2005?

  • The team I work on does nothing but data conversions. We convert data for new customers from their legacy system format into our database format.

    In general, we get their data into a SQL Server database, then use a series of stored procedures to convert this data into the new customer’s production database. A typical project can convert anywhere from tens of thousands of records to 20-30 million records. The SQL code related to the project can run anywhere from 5000-25,000 lines of code, depending on the number of modules being converted. Much of the code can be reused as we go from project to project. Customers may be using either SQL Server 2000 or SQL Server 2005. We run these processes on servers in-house for development and testing, then run them on the customer’s servers. So the processes run on a large variety of hardware configurations.

    When we started using SQL Server 2005, we noticed that occasionally a SELECT query would “hang”. (This would generally be one of the more complex queries containing multiple INNER and OUTER joins.) That is, a query expected to run within a couple of minutes might run for two or three hours without returning any data. Reviewing the data, indexes, server load, etc. revealed no particular problem. Running the query in SQL Server 2000 revealed no problem. We found that often, adding OPTION (MERGE JOIN) to the query would result in the query running as expected, even though the execution plan would show this to be a slower process.

    Later in the process, another query might hang. Adding the OPTION (MERGE JOIN) seemed to resolve the problem more often than not.

    Subsequent runs on the same hardware with the same data might cause queries to hang that had run just fine in a previous test. Likewise, some queries that ran faster after OPTION (MERGE JOIN) had been added will later hang and run better after OPTION (MERGE JOIN) is removed.

    Our team is fairly proficient in tuning queries, and understands how to use indexes and execution plans. We can see that, according to the execution plan, the merge join we are forcing is not as optimal as the hash join SQL Server seems to want to use. So we are at a loss to explain this behavior. We also have not discovered any way to predetermine which queries may have the problem.

    Has anyone else run into this, or does anyone have a plausible explanation as to why this may be happening? Any thoughts would be appreciated.

  • Not knowing much else, I can make too guess. First, the indexes need to be rebuilt or your statistics are out of date. This can lead the optimizer to make poor choices or the correct choices the optimizer makes don't work that well. Also, you could just be seeing an instance where the optimizer is not making the best choice. The optimizer only spends a certain amount of time evaluating plans and decides on the best plan based on cost estimates. It's not always correct. It's just mostly correct.

    I couldn't say anything more specific without seeing the execution plans, the query and possibly the structure.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I can say with almost certainty that you were winding up with a query plan that included nested loops for LOTS of rows (also possibly on unindexed fields). Your forced merge was faster due to orders of magnitude less I/O (even if a SORT needed to be performed prior to the actual merges). You can avoid this by 1) ensuring proper indexing, perhaps updating stats manually (or maybe creating them in the first place) prior to running large queries 2) avoiding use of table variables, 3) avoiding use of 'bad' things in join/wheres like ORs or large IN clauses, 4) perhaps doing dynamic sql or forcing recompile on the code.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • You may want to look at forcing the query to recompile each time it executes (either through proc level or statement level recompile option).

  • Hello -- I'm working with much smaller databases for web-based database applications using SQL Server Express 2005. I've been very pleased with it for the last few years but, unfortunately, I am now getting occasional slow-downs whereby the database server seems to be hanging. I suspect it's the database server because I occasionally get the same behavior when running queries in Management Studio... Although I can't find the cause of the problem or a solution -- it only happens occasionally (but it's frustrating when it does!) -- I thought I would mention this here because the symptoms I'm experiencing are the same, but I'm only dealing with small databases! Also, I should mention it's only happening on my "localhost" which runs XP Pro but, fortunately, it's not happening on my live hosting server which runs W2K3 Server Web Edition.

  • office (1/18/2009)


    Hello -- I'm working with much smaller databases for web-based database applications using SQL Server Express 2005. I've been very pleased with it for the last few years but, unfortunately, I am now getting occasional slow-downs whereby the database server seems to be hanging. I suspect it's the database server because I occasionally get the same behavior when running queries in Management Studio... Although I can't find the cause of the problem or a solution -- it only happens occasionally (but it's frustrating when it does!) -- I thought I would mention this here because the symptoms I'm experiencing are the same, but I'm only dealing with small databases! Also, I should mention it's only happening on my "localhost" which runs XP Pro but, fortunately, it's not happening on my live hosting server which runs W2K3 Server Web Edition.

    You should probably start a new topic so that more people see your question and respond to it.

    Real quick, you need to gather data to validate that it is SQL Server causing your slow-downs. You might start with this article[/url]. It's pretty introductory and kind of high level, but I think it'll help. When you do post to the new thread, include as many details as you can so that people who are not sitting at your desk can understand the problem.

    Finally, if you're running a web server and SQL Server side-by-side, you will hit issues. SQL Server is a great piece of software, but it doesn't play well with others, if you know what I mean.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply