How SQL Server Chooses the Type of Join

  • If sp_recompile is run against a stored procedure, the procedure is only recompiled the next time it executes; the current execution still uses the plan compiled on the previous run. So it's still unclear how that solved the problem. As other members mentioned above, using the WITH RECOMPILE clause would be more beneficial.

    Another benefit of WITH RECOMPILE is that it instructs the engine not to place the execution plan in the procedure cache, saving the overhead of caching it.
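    As a minimal sketch (the procedure and table names here are hypothetical, not from the article), a definition using WITH RECOMPILE looks like this; a fresh plan is built for each call and nothing is stored in the procedure cache:

    CREATE PROCEDURE dbo.GetOrdersByDateRange
        @StartDate DATETIME,
        @EndDate   DATETIME
    WITH RECOMPILE
    AS
    BEGIN
        -- A plan tailored to these parameter values is compiled on every execution.
        SELECT o.OrderID, o.OrderDate, o.CustomerID
        FROM dbo.Orders AS o
        WHERE o.OrderDate BETWEEN @StartDate AND @EndDate;
    END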

     

  • Steve Jones

    Thanks for your response

    You said "They didn't have a good way to fix it other than flush or restart the server" That is part of my problem once the SP slows down I can not do what you suggest because my site is hosted so I do not have authorization to flush or restart. This is actually the second time this has happened. The first time the hosting service restarted the server fixing the problem. I reworked all my SP to try and improve them. Everything worked fine for about 5 months then the SP's slowed down again.

    I've tried:

    - Dropping Tables, Stored Procedures, Indexes

    - Updating statistics

    - Reworking SP's

    - Forcing SP recompiles (with sp_recompile and WITH RECOMPILE)

    The only thing that worked once was for the hosting service to restart the server, and I don't think they will do that again. How can I force my entire plan cache to clear out so I can start the caching process from scratch?

    Thanks in advance

  • What really helped me from the article was the info about hash joins vs. nested loop joins. I wound up doing some more research and was able to put hints on a few of our key joins, which sped the procedure up from 7 seconds to 3. During testing I found that the hash joins were slower than the nested loop joins, so I tried merge joins and voila!
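    For illustration only (the table and column names below are invented, not taken from the poster's system), a join hint in T-SQL looks like this; MERGE forces a merge join for that one join, and LOOP or HASH can be substituted to test the other join types:

    SELECT o.OrderID, c.CustomerName
    FROM dbo.Orders AS o
    INNER MERGE JOIN dbo.Customers AS c  -- force a merge join instead of hash or nested loops
        ON o.CustomerID = c.CustomerID
    WHERE o.OrderDate >= '20060101';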

    Here's a good link on joins

    http://blogs.msdn.com/craigfr/archive/2006/08/16/702828.aspx

  • Frank:

    I used to have problems like this too.  I kept scripts for all my views and stored procedures.  The script has a drop statement at the top followed by the create.  Any time anything changed (add an index, add a column, etc) I would merge all my little scripts into one big one and run it.  All new execution plans in the cache.

    Brute force, I know, but you have to do what you have to do.
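    As a rough sketch of that approach (the procedure name is hypothetical), each script drops the object if it exists and then recreates it, which throws away its cached plan:

    IF OBJECT_ID('dbo.GetOrdersByDateRange') IS NOT NULL
        DROP PROCEDURE dbo.GetOrdersByDateRange;
    GO
    CREATE PROCEDURE dbo.GetOrdersByDateRange
        @StartDate DATETIME,
        @EndDate   DATETIME
    AS
        -- Recreating the procedure forces a new plan to be compiled on its next call.
        SELECT o.OrderID, o.OrderDate
        FROM dbo.Orders AS o
        WHERE o.OrderDate BETWEEN @StartDate AND @EndDate;
    GO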

    ATB, Charles Kincaid

  • Charles Kincaid

    Thanks for your response

    That is what I thought. I have run drop/creates, but it does not seem to make a difference, so I must be doing something wrong. I don't mind brute force as long as it works. All this stuff is a learning experience, so I wouldn't be surprised if I did it wrong. I'll try again.

  • I am curious to know, in reference to this article:

    a> Were any dynamic SQL statements created within the stored procedure?

    b> Is it possible to split it into multiple small stored procedures and call them from a single stored procedure?

    c> Are you generating any errors from the stored procedures (RAISERROR, or TRY...CATCH in the case of SQL Server 2005) to return to the application?

    d> Does the end user have permission to call this stored procedure, is he the owner of it, and how is it being called? Is it called with a fully qualified name?

    e> If the stored procedure is not called with the owner name qualified, SQL Server enters the compile code and acquires a COMPILE lock on the procedure, then determines that a new plan is not required (assuming no other reasons apply), so it does NOT recompile the plan at this point due to the lack of qualification. However, the extra step of getting a COMPILE lock on the procedure can cause blocking contention in severe situations.

    f> Are you using parameterization for the SQL statements in your stored procedure, i.e. are they executed using sp_executesql? (A sketch follows after this list.)

    g> The KEEP PLAN option is helpful if you are creating any temp tables, as no plan is generated until they are created. Microsoft says that to ensure the optimizer uses the best plan in all cases involving temporary tables, a special algorithm was developed to be more aggressive with recompilations. The algorithm states that if a temporary table created within a stored procedure has changed more than six times, the procedure will be recompiled when the next statement references the temporary table. KEEP PLAN does not prevent recompilations altogether; it simply prevents those caused by more than six changes to temporary tables referenced in the procedure.

    h> The following five SET options are set to ON by default:

    ANSI_DEFAULTS

    ANSI_NULLS

    ANSI_PADDING

    ANSI_WARNINGS

    CONCAT_NULL_YIELDS_NULL

    If you execute a SET statement to set any of these options to OFF, the stored procedure will be recompiled every time it runs. The reason is that changing these options may affect the query results, and that is what triggers the recompilation. Using KEEP PLAN here won't help avoid the recompilation, because the cause of the recompilation is the SET statement itself.

    i> The following table lists some common SET statements and whether or not changing the SET statement in a stored procedure causes a recompile:

    SET statement                  Causes recompile
    -----------------------------  ----------------
    SET QUOTED_IDENTIFIER          No
    SET ARITHABORT                 Yes
    SET ANSI_NULL_DFLT_ON          Yes
    SET ANSI_DEFAULTS              Yes
    SET ANSI_WARNINGS              Yes
    SET ANSI_PADDING               Yes
    SET CONCAT_NULL_YIELDS_NULL    Yes
    SET NUMERIC_ROUNDABORT         No
    SET NOCOUNT                    No
    SET ROWCOUNT                   No
    SET XACT_ABORT                 No
    SET IMPLICIT_TRANSACTIONS      No
    SET ARITHIGNORE                No
    SET LOCK_TIMEOUT               No
    SET FMTONLY                    No

    The reason I focused on recompiles is that, as you mentioned, your stored procedure is using the WITH RECOMPILE clause, while the factors above can implicitly recompile the cached plan as well, which I think is double overhead. I hope the above had already been considered while troubleshooting the stored procedure; I'm just trying to add some additional tips in case they help.
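    As a minimal sketch of the parameterization in (f) and the KEEP PLAN hint in (g) (dbo.Orders and the column names are made up for illustration), dynamic SQL run through sp_executesql reuses one parameterized plan, and OPTION (KEEP PLAN) relaxes the temp-table recompile threshold:

    DECLARE @sql NVARCHAR(500);
    SET @sql = N'SELECT OrderID, OrderDate
                 FROM dbo.Orders
                 WHERE OrderDate BETWEEN @Start AND @End';
    -- One cached, parameterized plan is reused across calls with different dates.
    EXEC sp_executesql @sql,
         N'@Start DATETIME, @End DATETIME',
         @Start = '20060101', @End = '20070101';

    -- KEEP PLAN raises the recompile threshold for changes to a temp table.
    CREATE TABLE #Staging (OrderID INT);
    INSERT INTO #Staging (OrderID) SELECT OrderID FROM dbo.Orders;
    SELECT t.OrderID FROM #Staging AS t OPTION (KEEP PLAN);
    DROP TABLE #Staging;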

    Thanks

    Jay

     

     

     

  • I've just finished a(nother) SQL course and this exact scenario was discussed, with both of these solutions proposed, i.e. using WITH RECOMPILE, or using an intermediate SP to decide which final SP to call based on the data range. We have this problem with one of our apps at the moment.

    Has anyone used the DBCC FREEPROCCACHE call, say on a nightly or weekly basis from a job? This would force a recompile of all procs.
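    For what it's worth, a minimal sketch of that idea as a scheduled job step (note that it clears the entire plan cache for the instance, not just the problem procedure, and it requires server-level permissions that a shared host may not grant):

    -- Clears all cached plans; every query and procedure recompiles on its next run.
    DBCC FREEPROCCACHE;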

    Leo

  • In my experience with running queries in .NET apps vs. Query Analyzer, most people don't actually call the procedures or ad-hoc queries in the same way. For example, it IS a best practice to always use a "parameterized command" in .NET; that way your code will hit the same cached plan REGARDLESS of whether it's an ad-hoc query or a stored procedure being called.

    So, imagine I have a .NET app that is using a parameterized command, and on the 2nd call it retrieves a much bigger data set than the 1st call. In this case it will use the same cached plan, because the only thing changing is the command parameter values. However, in Query Analyzer, most people don't take the time to actually declare variables for their ad-hoc tests, and instead call the procs by explicitly passing literal values.

    Example:

    SqlCommand command = new SqlCommand();

    command.CommandText = "spGetData";

    command.CommandType = CommandType.StoredProcedure;

    command.Parameters.Add(...);

    command.ExecuteNonQuery();

    // This execution always hits the same plan

    -- Now for Query Analyzer

    Exec spGetData '01/01/2006', '01/01/2007'

    -- This will NOT use the same plan if other dates are used

    -- because the parameters are changing

    Declare @Date1 DateTime, @Date2 DateTime

    Select @Date1 = '01/01/2006', @Date2 = '01/01/2007'

    Exec spGetData @Date1, @Date2

    Select @Date1 = '01/01/2006', @Date2 = '01/01/2008'

    Exec spGetData @Date1, @Date2

    -- This WILL use the same query plan and could potentially be slower...

    All I'm saying is that you should be careful about how you are testing the speed and query plans. You can't blame .NET every time something is slow. I agree with the WITH RECOMPILE option, but I might also try prepending a MERGE, HASH, or LOOP (nested loops) hint to my joins if I could find an optimal solution that way.

  • Very useful, as I'm struggling with exactly the same problem:

    a stored procedure that joins two tables based on a selection from one table that can return as few as 50 rows or as many as 2 million!

    I will test this and see how it runs.

  • I had a similar problem with a query that was being run from an MS Access front end vs. me running it directly on SQL Server. Access was taking 30 seconds; SQL Server was taking <1 second.

    It turned out that when the Access-generated SQL was being run, SQL Server wasn't using an index that it did use when the query was run directly in Query Analyzer (I did some background checking and looked at query plans to find this out).

    I added an index hint to the Access SQL on the field/index in question to force its use, and BANG, <1 second in my Access version. A rough sketch of the idea is below.
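    For illustration only (the table, index, and column names here are hypothetical), a table hint that forces a particular index looks like this:

    SELECT o.OrderID, o.OrderDate
    FROM dbo.Orders AS o WITH (INDEX (IX_Orders_OrderDate))  -- force this index
    WHERE o.OrderDate >= '20060101';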

    martin

  • Good Article Mike.  Well done!

    Cheers,

    Dave

    Thanks for all of the feedback. I think the WITH RECOMPILE option is probably the best choice, looking back. The hints approach didn't work because we really didn't know the size of the tables; just because it was a large date range didn't mean it was a larger data set. I wanted, and trusted, the server to choose the right plan, which it did when it was recompiled.

    I also want to clear up that I wasn't blaming .NET. I was just stating what we saw, but the points that have been raised are valid and should be considered when troubleshooting. For me, the big thing here was discovering how and why the engine chose the type of join it did, and what that join was best at. The difference between Query Analyzer and the .NET application just helped to shed some light on it. I am pleased with all of the discussion that has taken place and all of the wonderful ideas that have come from it. Thanks for keeping it positive and adding value to the discussion.

    -mike

  • This does happen on occasion with any SQL Server installation, but the main reason is actually statistics; in fact, it is more than likely the sample level of the statistics. What is happening is that the sample rate is probably at the default, and on a large table (a few million rows or more) this default rate is not adequate for the amount of data you have. SQL Server is therefore not able to accurately estimate the number of rows that the input variables are going to fetch, and so it keeps using the old plan.

     Try something.

    1) Take this same procedure and duplicate the issue WITHOUT issuing a recompile in the proc. Once you can confirm that you know exactly how to duplicate the issue, look at the execution plan and the estimated row count for each case (the fast query and the slow query). Run the proc without the recompile in both cases and see how close the estimates are. What you will most likely find is that the version that takes longer than it should has the highest delta between the estimated row count and the actual, which is typically directly related to the statistics sample rate not being high enough for the tables involved.

    2) Now update the statistics on the tables involved (update the stats for each index on every table used, as well as the system statistics on the columns in those tables) and force the FULLSCAN sample rate (see the sketch below). Perform the same test from step 1 again and you should see that the execution plan changes and the estimated row count is more accurate.

    DBCC SHOW_STATISTICS ( Table_Name_Here , Index_Name_Here )

    Run this and look at the Rows Sampled column; if this is a few-million-row table you will see that it is not actually sampling very much. In my case I did this on a "small" 2-million-row table and the rows sampled was 81k, which equates to about 4% or so.
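    A minimal sketch of the full-scan statistics update from step 2 (Table_Name_Here and Index_Name_Here are placeholders, as above):

    -- Rebuild the statistics for one index by reading every row instead of a sample.
    UPDATE STATISTICS Table_Name_Here Index_Name_Here WITH FULLSCAN;

    -- Or refresh all statistics on the table with a full scan.
    UPDATE STATISTICS Table_Name_Here WITH FULLSCAN;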

    Thanks..Richard

     

  • Rather than looking at recompiles and statistics, how about looking at the joins themselves?

    Would it be possible to have two versions of the proc, each with distinct join hints: one using loop joins and the other using merge or hash joins? This would avoid the constant recompiles. A rough sketch follows below.
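    Purely as an illustration of that pattern (all procedure, table, and parameter names are invented, and the 31-day cutoff is arbitrary), a thin wrapper could choose between a loop-hinted and a merge-hinted version based on the size of the date range:

    CREATE PROCEDURE dbo.GetOrders_Loop @StartDate DATETIME, @EndDate DATETIME
    AS
        SELECT o.OrderID, c.CustomerName
        FROM dbo.Orders AS o
        INNER LOOP JOIN dbo.Customers AS c ON o.CustomerID = c.CustomerID
        WHERE o.OrderDate BETWEEN @StartDate AND @EndDate;
    GO
    CREATE PROCEDURE dbo.GetOrders_Merge @StartDate DATETIME, @EndDate DATETIME
    AS
        SELECT o.OrderID, c.CustomerName
        FROM dbo.Orders AS o
        INNER MERGE JOIN dbo.Customers AS c ON o.CustomerID = c.CustomerID
        WHERE o.OrderDate BETWEEN @StartDate AND @EndDate;
    GO
    CREATE PROCEDURE dbo.GetOrders @StartDate DATETIME, @EndDate DATETIME
    AS
        -- Each branch calls a separate procedure, so each keeps its own cached plan.
        IF DATEDIFF(DAY, @StartDate, @EndDate) <= 31
            EXEC dbo.GetOrders_Loop  @StartDate, @EndDate;
        ELSE
            EXEC dbo.GetOrders_Merge @StartDate, @EndDate;
    GO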

    MVDBA

  • If the statistics were "bad", or insufficiently sampled, why would running the query with the recompile option help? You'd be building a new query plan from the same bad or incomplete information, and I would think you'd come up with the same ill-chosen query plan.

    I believe RichardReynolds is correct in concept, though. If your table is not made up of homogeneous data, you need to sample at a much higher rate than the default. Because I believe that's true, we rebuild all indexes each week (on Sunday, when we have very few users online). That gives us a complete statistics sample every week; a sketch of the idea is below.
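    As a minimal sketch of that weekly maintenance (dbo.Orders is a placeholder; a real job would loop over all user tables), rebuilding an index also rebuilds its statistics with the equivalent of a full scan. ALTER INDEX is the SQL Server 2005 syntax; DBCC DBREINDEX is the older equivalent:

    -- Rebuilding every index on the table refreshes its statistics with a full sample.
    ALTER INDEX ALL ON dbo.Orders REBUILD;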

    Still, we see the same problem described here. Even with proper statistics, once the query plan has been built and cached for the first set of data it encounters, it seems to be reused even if the statistics would show it should be re-evaluated for another set of parameters to the same query. We use the recompile option for several such stored procedures.

    My opinion is that the query plan does not get automatically recompiled when a quick glance at the spread of data for a given set of parameters would indicate that it should be. In my mind, this is in the "bug" category.

     


    Student of SQL and Golf, Master of Neither
