Why does the SQL Database Engine avoid the indexes when I use variables?

  • I'm improving our SSIS process by passing the variables through a "DECLARE", and I found this:

    Here you have the execution plan using variables: brentozar.com/pastetheplan/?id=rkSxs7VPv

    And here the one without variables: brentozar.com/pastetheplan/?id=SJ8soXEww

    As you can see, when I use the "DECLARE" the SQL Database Engine doesn't use the index. The index it should be using is a nonclustered index on the column in the "where" clause.

    Both executions were run on the same server, with the same SET options, at the same time.

    Thanks!

  • In each plan, the estimated number of rows is wildly different from the actual number, which makes me think either that your statistics are out of date, or that you've got parameter sniffing going on (you previously ran the queries with different parameter values).  Try updating the statistics on the fact_Venta table (or, if it's a view, the tables beneath it).  That will also cause the plans to be evicted from the plan cache, and you'll start with a fresh plan that will hopefully be appropriate to the arguments you pass in.
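
    If it helps, the statistics update could look something like this (a minimal sketch, assuming fact_Venta is a table in the dbo schema; adjust the schema name and the sampling option to suit the size of the table):

    -- Refresh statistics on the table named in the plans (schema assumed to be dbo)
    UPDATE STATISTICS dbo.fact_Venta WITH FULLSCAN;

    -- Or refresh statistics across the whole database
    EXEC sp_updatestats;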

    John

  • Read about "parameter sniffing".

    It's a very common issue.

    _____________
    Code for TallyGenerator

  • Hi @John-Mitchell-245523

    If what you're telling me is that the brentozar plans are showing the results incorrectly, I can assure you that both queries return 20354 rows. Both queries filter on the same values.

    Thanks for your observation.

     

    Hi @sergiy

    I'll read up on "parameter sniffing".

    Thank you for your contribution.

  • ngonzalez wrote:

    Hi @John-Mitchell-245523

    If what you're telling me is that the brentozar plans are showing the results incorrectly, I can assure you that both queries return 20354 rows. Both queries filter on the same values.

    Thanks for your observation.

    You're welcome.  The actual number of rows is indeed the same in each plan, but the estimated number certainly isn't.  Since the query optimizer doesn't have the actual numbers available to it at compile time, it's the estimated number that's important, and that's why you have two different plans.  It also isn't important (in terms of what the execution plan will look like) what arguments you last passed to the two queries - it's the values you used the first time you ran the queries, because that's when the plans would have been compiled.  You'll learn all that, and more, and probably explained better, when you read about parameter sniffing.
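
    If you want to force a fresh compilation without waiting for a statistics update, sp_recompile will mark the cached plans that reference the table for recompilation (just a sketch; the dbo schema is an assumption):

    -- Marks all cached plans that reference the table for recompilation
    EXEC sp_recompile N'dbo.fact_Venta';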

    John

  • Hi @John-Mitchell-245523

    Yes, I've read about it, and rereading your post to understand it better, I saw the following:

    • Without parameters, the estimated number of rows is 1 (but the actual was 20354), so in this case it uses the index
    • With parameters, the estimated number of rows is 7299000 (and the actual 20354), so in this case it doesn't use the index

    Maybe I'm still a noob with indexes and SQL (maybe because I'm a robotics engineer switching careers to BI xD), but I don't understand why, if the estimated number of rows is so high, it doesn't use the index.

    In any case, until I find a better solution, I'll use "WITH (INDEX(IN01_fact_Venta_Data_ID))" to force the query with the parameters to use the index.
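
    Roughly like this (only a sketch: the variable, the Data_ID column, and its data type are guesses based on the index name, not the real query):

    DECLARE @Data_ID int = 20200101;  -- hypothetical date key set by the SSIS package

    SELECT *
    FROM   dbo.fact_Venta WITH (INDEX (IN01_fact_Venta_Data_ID))  -- force the nonclustered index
    WHERE  Data_ID = @Data_ID;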

    Thanks!

  • If the estimated number of rows is high enough, then the query optimizer will judge that it's more efficient to scan the whole clustered index than it is to do lookups from the non-clustered index for each value.  That's the difference between your two plans.
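
    One way to see that tradeoff for yourself (a sketch only; the variable and the Data_ID column are assumptions based on the index name in your earlier post) is to compare logical reads for the optimizer's choice against a forced seek:

    SET STATISTICS IO ON;

    DECLARE @Data_ID int = 20200101;  -- hypothetical date key

    -- Optimizer's choice: with a high row estimate this will likely scan the clustered index
    SELECT * FROM dbo.fact_Venta WHERE Data_ID = @Data_ID;

    -- Forced seek: nonclustered index seek plus a key lookup per row
    SELECT * FROM dbo.fact_Venta WITH (FORCESEEK) WHERE Data_ID = @Data_ID;

    SET STATISTICS IO OFF;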

    The trouble with forcing use of the index is that it'll be used even when it's not appropriate for the arguments provided.  Kimberly Tripp has written at length about how to overcome parameter sniffing issues - I would recommend that you check that out.  Meanwhile, have you updated your statistics as I suggested earlier?

    John

  • Hi @John-Mitchell-245523

    OK about the index (Y)

    About forcing it: since I saw that the query with parameters doesn't use the index I need, I'm going to check which index a normal query uses and apply it to the query with parameters. These are static queries that only change the date value, because they're used in the SSIS process, so the rest of the query stays the same.

    I'll read the Kimberly Tripp material.

    I've just updated the statistics; it only slightly changed the estimates for the query without parameters, while the other one stayed the same, so neither execution plan changes.

    Thanks!

  • Just for more information: I won't force the index. Instead, I'll use "OPTION (RECOMPILE)", which I read avoids the parameter sniffing issue, and I've tested that it works 😀
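
    For anyone else reading, the change is roughly this (a sketch only; the variable and column names are placeholders for the real SSIS query):

    DECLARE @Data_ID int = 20200101;  -- hypothetical date key set by the SSIS package

    SELECT *
    FROM   dbo.fact_Venta
    WHERE  Data_ID = @Data_ID
    OPTION (RECOMPILE);  -- recompile with the runtime value of the variable, so the estimate uses that value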

  • Wait, are we talking parameters or variables as stated in the original question? They're different and result in differences in estimated row counts.

    You've already received the scoop on parameters. The actual value is used to get an estimate from the statistics on the given number of rows to be returned. Parameter sniffing in action.

    However, with variables, things are different. Unless we're talking about a recompile situation (more on that in a moment), the nature of a variable is such that the optimizer doesn't know what value it contains. So, instead of using a precise value against the statistics, as it does in parameter sniffing, it uses an average value from the stats. You're therefore going to see radically different row estimates for a variable than you will for a parameter. Those row estimates for a variable may lead to the index being ignored by the optimizer, whereas with a parameter it'll be used, because the row estimates are different.

    Now, in a statement-level recompile event, the optimizer does know what value the variable contains. Here, you'll find, it acts more like a parameter, and the precise row count for the given value is used. However, this is only evident in a statement-level recompile: not in an initial compile, and not in a batch- or procedure-level recompile either. Only statement-level recompiles can lead to variable sniffing.
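
    To make that concrete, here's a rough sketch of the three cases (the table comes from the thread; the values and the Data_ID column are assumptions):

    -- 1. Variable, initial compile: the value isn't known, so the estimate comes
    --    from the density vector (an "average" row count).
    DECLARE @Data_ID int = 20200101;
    SELECT COUNT(*) FROM dbo.fact_Venta WHERE Data_ID = @Data_ID;

    -- 2. Parameter: the value is sniffed at compile time, so the estimate comes
    --    from the statistics histogram for that specific value.
    EXEC sp_executesql
        N'SELECT COUNT(*) FROM dbo.fact_Venta WHERE Data_ID = @Data_ID;',
        N'@Data_ID int',
        @Data_ID = 20200101;

    -- 3. Variable with a statement-level recompile: the runtime value can be
    --    "sniffed", so the estimate behaves like the parameter case.
    SELECT COUNT(*) FROM dbo.fact_Venta WHERE Data_ID = @Data_ID
    OPTION (RECOMPILE);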

    Hope that clarifies things a little.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Spot on, Grant.  Another thing that the use of variables affects is whether or not even properly set-up "Minimal Logging" will occur.  If you're trying to do Minimally Logged inserts into a fresh new table and your code to do so contains a variable, you're going to have to use a statement-level OPTION (RECOMPILE) to get the Minimal Logging to actually occur.  That issue is NOT documented anywhere that I can find in the MS documentation.

    I've not yet tested that issue against parameters and, as you've so very adequately pointed out, it can and frequently does make a huge difference.
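
    For anyone who wants to try to reproduce it, the scenario looks roughly like this (a sketch only: the table names, column, and row count are made up, and minimal logging also depends on the recovery model and the usual prerequisites):

    DECLARE @Rows int = 1000000;

    -- Insert into an empty target with TABLOCK, which can qualify for minimal logging;
    -- because the statement contains a variable, OPTION (RECOMPILE) is what lets the
    -- minimal-logging path kick in (per the observation above).
    INSERT INTO dbo.NewTable WITH (TABLOCK)   -- hypothetical, freshly created target table
           (SomeColumn)
    SELECT TOP (@Rows) SomeColumn
    FROM   dbo.SourceTable                    -- hypothetical source table
    OPTION (RECOMPILE);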

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
