Top 1 makes very fast query into very slow query

  • Yes.    Eirikur's version of DelimitedSplit8k is fast AND is an inline table valued function.   Everyone who can't use String_Split should get a copy.   Also, use of an inline table-valued function just may enable the optimizer to develop the Hash Join without the hint.

    Inline table valued functions are like views that accept parameters.    The optimizer knows how to estimate the output from them when building a query plan.     Multiline table valued functions are like a black box to the optimizer.    Scroll over the function in your query plan and look at the estimated costs to see.

     

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline wrote:

    Yes.    Eirikur's version of DelimitedSplit8k is fast AND is an inline table valued function.   Everyone who can't use String_Split should get a copy.   Also, use of an inline table-valued function just may enable the optimizer to develop the Hash Join without the hint.

    Inline table valued functions are like views that accept parameters.    The optimizer knows how to estimate the output from them when building a query plan.     Multiline table valued functions are like a black box to the optimizer.    Scroll over the function in your query plan and look at the estimated costs to see.

    Ah, good point. I'd been thinking only about the execution speed of the function, and hadn't considered that a different function might cause the query optimizer to sober up. I'll give it a try. Thank you.

  • pdanes wrote:

    The Dixie Flatline wrote:

    Yes.    Eirikur's version of DelimitedSplit8k is fast AND is an inline table valued function.   Everyone who can't use String_Split should get a copy.   Also, use of an inline table-valued function just may enable the optimizer to develop the Hash Join without the hint.

    Inline table valued functions are like views that accept parameters.    The optimizer knows how to estimate the output from them when building a query plan.     Multiline table valued functions are like a black box to the optimizer.    Scroll over the function in your query plan and look at the estimated costs to see.

    Ah, good point. I'd been thinking only about the execution speed of the function, and hadn't considered that a different function might cause the query optimizer to sober up. I'll give it a try. Thank you.

    The critical join is between dbo.Podrobnosti and dbo.Akces and thus should not be related to the other JOIN.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Output of the splitter function is not used anywhere in SELECT, further JOIN or ORDER BY.

    Therefore it's better be not in INNER JOIN but in WHEE EXISTS check:

    SELECT TOP 1 
    P.AkcesAutoID --, A.AkcesitPred, A.Akcesit, A.Rok
    FROMdbo.Podrobnosti P
    INNER JOINdbo.Akces A ON P.AkcesAutoID = A.AkcesAutoID
    WHERE EXISTS (select * FROM WhateverSliptterFunctionYouChoose ('J') Ltrs
    WHERE Ltrs.EvidenceLetter = P.EvidenceLetter)
    Order By A.Rok, A.AkcesitPred, A.Akcesit

    And avoid adding layers of subqueries - it only complicates the job of optimiser, especially if ORDER BY is involved.

    P.S. "Podrobnosti" - how long it's been... 🙂

    • This reply was modified 3 years ago by  Sergiy.

    _____________
    Code for TallyGenerator

  • ScottPletcher wrote:

    pdanes wrote:

    The Dixie Flatline wrote:

    Yes.    Eirikur's version of DelimitedSplit8k is fast AND is an inline table valued function.   Everyone who can't use String_Split should get a copy.   Also, use of an inline table-valued function just may enable the optimizer to develop the Hash Join without the hint.

    Inline table valued functions are like views that accept parameters.    The optimizer knows how to estimate the output from them when building a query plan.     Multiline table valued functions are like a black box to the optimizer.    Scroll over the function in your query plan and look at the estimated costs to see.

    Ah, good point. I'd been thinking only about the execution speed of the function, and hadn't considered that a different function might cause the query optimizer to sober up. I'll give it a try. Thank you.

    The critical join is between dbo.Podrobnosti and dbo.Akces and thus should not be related to the other JOIN.

    But then why does the HASH directive fix the query, no matter where it is placed?

  • When choosing a plan optimiser has to estimate the data sets coming out of each of the objects (tables, views, functions) joined in the query.

    it uses indexes, statistics, to get that evaluation.

    Multi-statement TVF is kinda a black box for optimiser. It has no way to predict the size of the data set coming out of it. So, it makes some wild guesses, based on some generic assumptions. Which might be the best choice in some situations. But not in yours.

    You have advantage to the optimiser of knowing how many rows will be returned by the function, so you may suggest it to use a plan which (you know from experiment) is the best for the data set you anticipate.

    That's why HASH fixes the issue.

    Hope it makes sense.

    _____________
    Code for TallyGenerator

  • Sergiy wrote:

    When choosing a plan optimiser has to estimate the data sets coming out of each of the objects (tables, views, functions) joined in the query.

    it uses indexes, statistics, to get that evaluation.

    Multi-statement TVF is kinda a black box for optimiser. It has no way to predict the size of the data set coming out of it. So, it makes some wild guesses, based on some generic assumptions. Which might be the best choice in some situations. But not in yours.

    You have advantage to the optimiser of knowing how many rows will be returned by the function, so you may suggest it to use a plan which (you know from experiment) is the best for the data set you anticipate.

    That's why HASH fixes the issue.

    Hope it makes sense.

    It does. I guess it takes the HASH directive into account for the whole query, not just the one JOIN where it is specified.

  • "I guess it takes the HASH directive into account for the whole query, not just the one JOIN where it is specified."

    When you force the HASH join on one table, the optimizer now has more information with which to make it's decisions on other joins.   In this case, it decided another HASH join was appropriate, but don't count on it always doing HASH joins across the board.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply