SQLServerCentral is supported by Red Gate Software Ltd.
 

How SQL Server Chooses the Type of Join
Posted Monday, May 07, 2007 3:20 PM
SSC Veteran


Group: General Forum Members
Last Login: 2 days ago @ 1:50 PM
Points: 214, Visits: 360

I am curious to know, in reference to this article:

a> whether any dynamic SQL statements were created within the stored procedure

b> is it possible to split it into multiple small stored procedures and call them from a single stored procedure

c> are you generating any errors from the stored procedure (RAISERROR, or TRY...CATCH in the case of SQL Server 2005) to return to the application

d> does the end user have permission to call this stored procedure, is the user its owner, and how is it being called: is it called with its fully qualified name?

e> if the stored procedure is not called with the owner name qualified, SQL Server enters the compile code and acquires a COMPILE lock on the procedure, then determines that a new plan is not required (assuming no other reasons apply), so it does NOT recompile the plan at this point due to the lack of qualification. However, the extra step of acquiring a COMPILE lock on the procedure can cause blocking contention in severe situations.

f> are you using parameterization in your stored procedure for the SQL statements that are executed using sp_executesql?

g> the KEEP PLAN option is helpful if you are creating any temp tables, as no plan is generated for them until they are created. Microsoft says that to ensure the optimizer uses the best plan in all cases concerning temporary tables, a special algorithm was developed to be more aggressive with recompilations. The algorithm states that if a temporary table created within a stored procedure has changed more than six times, the procedure will be recompiled when the next statement references the temporary table. KEEP PLAN does not prevent recompilations altogether; it simply prevents those caused by more than six changes to temporary tables referenced in the procedure.

h> The following five SET options are set to ON by default:
ANSI_DEFAULTS
ANSI_NULLS
ANSI_PADDING
ANSI_WARNINGS
CONCAT_NULL_YIELDS_NULL
If you execute a SET statement to turn any of these options OFF, the stored procedure will be recompiled every time it runs. The reason is that changing these options may change the query result, and that is what triggers the recompilation. Using the KEEP PLAN option here won't help avoid the recompilation, because the cause of the recompilation is the SET statement.

i> The following table lists some common SET statements and whether changing the SET statement in a stored procedure causes a recompile:

SET Statement                    Recompile
SET QUOTED_IDENTIFIER            No
SET ARITHABORT                   Yes
SET ANSI_NULL_DFLT_ON            Yes
SET ANSI_DEFAULTS                Yes
SET ANSI_WARNINGS                Yes
SET ANSI_PADDING                 Yes
SET CONCAT_NULL_YIELDS_NULL      Yes
SET NUMERIC_ROUNDABORT           No
SET NOCOUNT                      No
SET ROWCOUNT                     No
SET XACT_ABORT                   No
SET IMPLICIT_TRANSACTIONS        No
SET ARITHIGNORE                  No
SET LOCK_TIMEOUT                 No
SET FMTONLY                      No
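Point f> above can be sketched like this. The query, table, and parameter names here are made up for illustration; the pattern is simply the standard parameterized sp_executesql call:

```sql
-- Hypothetical example: parameterized dynamic SQL via sp_executesql.
-- Because the values are passed as parameters instead of being
-- concatenated into the string, repeated calls with different dates
-- can reuse one cached plan.
DECLARE @sql NVARCHAR(4000)
SET @sql = N'SELECT OrderID, OrderDate
             FROM dbo.Orders
             WHERE OrderDate BETWEEN @StartDate AND @EndDate'

EXEC sp_executesql
    @sql,
    N'@StartDate DATETIME, @EndDate DATETIME',
    @StartDate = '20060101',
    @EndDate   = '20070101'
```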

The reason I focused on recompiles is that, as you mentioned, your stored procedure is using the WITH RECOMPILE clause, while the steps above are already implicitly recompiling the cached plan, which I think is double overhead. I hope the above information has already been considered while troubleshooting the stored procedure; I'm just trying to add some additional tips in case they help.

Thanks

Jay

Post #363825
Posted Monday, May 07, 2007 4:02 PM
Grasshopper


Group: General Forum Members
Last Login: Sunday, July 26, 2009 7:03 PM
Points: 13, Visits: 20

I've just finished a(nother) SQL course and this exact scenario was discussed, with both of these solutions proposed, i.e. using WITH RECOMPILE, or using an intermediate SP to decide which final SP to call based on the date range. We have this problem with one of our apps at the moment.
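The intermediate-SP approach mentioned above might look something like this. The procedure names and the 31-day threshold are made up for illustration:

```sql
-- Hypothetical sketch: a wrapper proc inspects the date range and
-- dispatches to one of two procs, each of which keeps its own cached
-- plan tuned for its typical row count.
CREATE PROCEDURE dbo.spGetData_Dispatch
    @StartDate DATETIME,
    @EndDate   DATETIME
AS
BEGIN
    IF DATEDIFF(day, @StartDate, @EndDate) <= 31
        EXEC dbo.spGetData_SmallRange @StartDate, @EndDate
    ELSE
        EXEC dbo.spGetData_LargeRange @StartDate, @EndDate
END
```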

Has anyone used the DBCC FREEPROCCACHE call? Say, running it on a nightly or weekly basis from a job? This would force a recompile of all procs.
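A job step along those lines would be just the one statement below (the schedule itself would be set up in SQL Server Agent). Note that this flushes every plan on the instance, so the first execution of each proc and ad-hoc batch afterwards pays the full compile cost:

```sql
-- Clears all plans from the procedure cache; every proc and ad-hoc
-- batch will be recompiled on its next execution.
DBCC FREEPROCCACHE
```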

Leo

Post #363835
Posted Monday, May 07, 2007 10:01 PM
SSC Veteran


Group: General Forum Members
Last Login: Saturday, February 02, 2013 8:21 AM
Points: 283, Visits: 268
In my experience running queries in .NET apps vs. Query Analyzer, most people don't actually call the procedures or ad-hoc queries in the same way. For example, it IS a best practice to always use a "parameterized command" in .NET; this way your code will hit the same cached plan REGARDLESS of whether it's an ad-hoc query or a stored procedure being called.

So, imagine I have a .NET app that is using a parameterized command, and on the 2nd call it retrieves a much bigger data set than the 1st call. In this case, it will use the same cached plan, because the only thing changing is the command parameter values. However, in Query Analyzer, most people don't take the time to actually declare "variables" for their ad-hoc calls, and instead call the procs with explicit literal values.

Example:
SqlCommand command = new SqlCommand();
command.Connection = connection; // assumes an open SqlConnection named "connection"
command.CommandText = "spGetData";
command.CommandType = CommandType.StoredProcedure;
command.Parameters.Add(...);
command.ExecuteNonQuery();
// This execution always hits the same plan

-- Now for Query Analyzer
Exec spGetData '01/01/2006', '01/01/2007'
-- This will NOT use the same plan if other dates are used
-- because the parameters are changing

Declare @Date1 DateTime, @Date2 DateTime
Select @Date1 = '01/01/2006', @Date2 = '01/01/2007'
Exec spGetData @Date1, @Date2
Select @Date1 = '01/01/2006', @Date2 = '01/01/2008'
Exec spGetData @Date1, @Date2
-- This WILL use the same query plan and could potentially be slower...

All I'm saying is that you should be careful how you are testing the speed and query plans. You can't blame .NET every time something is slow. I agree with the "WITH RECOMPILE" option, but I also might try prepending a MERGE, HASH, or nested LOOP join hint to my joins if I could find an optimal solution that way.



Post #363885
Posted Tuesday, May 08, 2007 1:57 AM
SSC Rookie


Group: General Forum Members
Last Login: Thursday, September 20, 2007 7:40 AM
Points: 39, Visits: 1
Very useful, as I'm struggling with exactly the same problem:

a stored procedure that joins two tables based on a selection from one table that can return as few as 50 rows or as many as 2 million!

I will test this and see how it runs.




Post #363907
Posted Tuesday, May 08, 2007 2:45 AM
SSC-Enthusiastic


Group: General Forum Members
Last Login: Friday, February 28, 2014 1:54 AM
Points: 171, Visits: 229

I had a similar problem with a query that was being run in an MS Access front end vs. me trying it directly on SQL Server. Access was taking 30 seconds; SQL Server was taking <1 second.

It turned out that when the SQL issued by Access was being run, SQL Server wasn't using an index that it did use when run directly in Query Analyzer (I did some background checking and looked at query plans to find this out).

I added an index hint to the Access SQL on the field/index in question to force its use, and BANG, <1 second in my Access version.

martin

Post #363909
Posted Tuesday, May 08, 2007 4:34 AM
SSC Veteran


Group: General Forum Members
Last Login: Monday, February 09, 2009 7:44 PM
Points: 282, Visits: 37
Good Article Mike.  Well done!

Cheers,
Dave
Post #363930
Posted Tuesday, May 08, 2007 7:03 AM
SSC Rookie


Group: General Forum Members
Last Login: Wednesday, September 26, 2012 12:03 PM
Points: 28, Visits: 38
Thanks for all of the feedback. I think the WITH RECOMPILE option is probably the best choice, looking back. The hints approach didn't work because you really didn't know the size of the tables; just because it was a large date range didn't mean it was a larger data set. I wanted and trusted the server to choose the right plan, which it did when it was recompiled.

I also want to clear up that I wasn't blaming .NET. I was just stating what we saw, but the points that have been raised are valid and should be considered when troubleshooting. For me, the big thing here was discovering how and why the engine chose the type of join it did, and what that join was best at. The difference between Query Analyzer and the .NET application just helped shed some light on it.

I am pleased with all of the discussion that has taken place and all of the wonderful ideas that have come from it. Thanks for keeping it positive and adding value to the discussion.
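For reference, WITH RECOMPILE can be applied either when the proc is created or per call. The procedure and its body here are placeholders, not the actual proc from the article:

```sql
-- Recompile on every execution, declared at the procedure level:
CREATE PROCEDURE dbo.spGetData
    @StartDate DATETIME,
    @EndDate   DATETIME
WITH RECOMPILE
AS
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE OrderDate BETWEEN @StartDate AND @EndDate

-- Or recompile only for a particular call:
EXEC dbo.spGetData '01/01/2006', '01/01/2007' WITH RECOMPILE
```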

-mike
Post #363975
Posted Tuesday, May 08, 2007 3:08 PM
Forum Newbie


Group: General Forum Members
Last Login: Monday, November 10, 2008 3:01 PM
Points: 1, Visits: 6

This does happen on occasion with any SQL Server installation, but the main reason is actually statistics; in fact, it is more than likely the sample level of the statistics. What is happening is that the sample rate is probably at the default, and on a large table (a few million rows or more) the default rate is not adequate for the amount of data you have. SQL Server is therefore not able to accurately estimate the number of rows the input variables are going to fetch, so it keeps using the old plan.

Try something:

1) Take this same procedure and duplicate the issue WITHOUT issuing a recompile in the proc. Once you can confirm that you know exactly how to duplicate the issue, look at the execution plan and the estimated row count for each case (the fast query and the slow query). Run the proc without the recompile in both cases and see how close the estimates are. What you will most likely find is that the version that takes longer than it should has the highest delta between the estimated row count and the actual row count, which is typically directly related to the statistics sample rate not being high enough for the tables involved.

2) Now update the statistics on the tables involved (update the stats for each index on every table used, as well as the system statistics on the fields in those tables) and force the FULL SCAN sample rate. Perform the test from step 1 again and you should see that the execution plan changes and the estimated row count is more accurate.

DBCC SHOW_STATISTICS ( Table_Name_Here , Index_Name_Here )
-- Run this and look at the Rows Sampled field. If this is a few-million-row table you
-- will see that it is not actually sampling very much. In my case I ran this on a
-- "small" 2 million row table and the rows sampled was 81k, which equates to about 4% or so.
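Step 2 above could be scripted roughly like this; the table and index names are placeholders:

```sql
-- Rebuild statistics with a full scan instead of the default sample rate.
UPDATE STATISTICS dbo.Table_Name_Here WITH FULLSCAN

-- Or target a single index/statistic on the table:
UPDATE STATISTICS dbo.Table_Name_Here Index_Name_Here WITH FULLSCAN
```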

Thanks..Richard

 

Post #364218
Posted Wednesday, May 09, 2007 2:47 AM


SSCrazy


Group: General Forum Members
Last Login: Thursday, April 03, 2014 3:48 AM
Points: 2,274, Visits: 767
Rather than looking at recompiles and statistics, how about looking at your joins themselves?

Would it be possible to have two versions of the proc, each with distinct join hints: one using loop joins and the other using merge or hash joins? This would avoid the constant recompiles.
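A minimal sketch of the two-version idea; the tables and columns are hypothetical, and the only difference between the two versions is the join hint:

```sql
-- Version tuned for small row counts: force a nested loop join.
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders o
INNER LOOP JOIN dbo.Customers c ON c.CustomerID = o.CustomerID
WHERE o.OrderDate BETWEEN @StartDate AND @EndDate

-- Version tuned for large row counts: force a hash join.
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders o
INNER HASH JOIN dbo.Customers c ON c.CustomerID = o.CustomerID
WHERE o.OrderDate BETWEEN @StartDate AND @EndDate
```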



MVDBA
Post #364300
Posted Wednesday, May 09, 2007 6:07 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Friday, December 06, 2013 6:52 AM
Points: 298, Visits: 332

If the statistics were "bad", or insufficiently sampled, why would running the query with the recompile option help?  You'd be making a new query plan with the same bad or incomplete information, and I would think you'd come up with the same ill-chosen query plan.

I believe RichardReynolds is correct in concept, though. If your table is not made up of homogeneous data, you need to sample at a much higher rate than the default. Because I believe that's true, we rebuild all indexes each week (on Sunday, when we have very few users online). That gives us a complete statistics sample every week.

Still, we see the same problem described here. Even with proper statistics, once the query plan has been built and cached for the first set of data it encounters, it seems to be reused even if statistics would show that it should be reevaluated for another set of parameters for the same query. We use the recompile option for several such stored procedures.

My opinion is that the query plan does not get automatically recompiled when a quick glance at the spread of data for a given set of parameters would indicate that it should. In my mind, this is in the "bug" category.

 




Student of SQL and Golf, Master of Neither
Post #364325