which query's execution costs are better/more efficient

  • Sorry, the ACM group is "SIGMETRICS", not PMG (which appears to be a company in the UK). But I am still sure that there is another research organization; I just cannot remember it ...

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Barry,

    I seem to have struck a nerve, though that was not my intention. I simply found your posts a little verbose and hard to understand. The parts that are specific are not written in language that is accessible to me. I have a feeling that much of your terminology has special meaning, but it is lost on me, I'm afraid.

    My point is that I provided a link to the information I found to be persuasive. It is there for anyone to read and come to their own conclusions. I am not Joe Chang and so don't intend to defend his work in detail, but I will do my best to respond to points made, if I feel I have something to add.

    I am not looking to question your decades of experience with a methodology I have no familiarity with, nor am I looking to belittle what is clearly a deeply-held belief on your part.

    I have not yet fully read your latest contribution, but it seems to say that we need to consider both worker time (CPU) and logical reads. That does not seem very far from my comment:

    I tend to give much more weight to properly-measured worker time than to reads. That is not to say that I ignore the number of reads: at the very least, a higher number of logical reads implies a possibly less efficient plan, and that warrants a deeper look. Of course, a higher number of reads necessarily involves higher worker time too. My approach is to give a greater weight to CPU usage, but with an eye on reads as well. Naturally, elapsed time is an important consideration too.

    I think that was just about the only part of my post that you didn't take exception to!

    On a separate point, am I to take it that you accept my points about the test script?

    Finally, have you now read the information in the first link I posted? You said you had not, but that may have changed.

    Paul

    edit: tags *again*

  • RBarryYoung (2/28/2010)


    Worse, though the claim that we should use CPU instead of Logical IO is clearly stated both here and in Joe's papers, no attempt at justification is made in either place.

    Me


    Of course, a higher number of reads necessarily involves higher worker time

  • I think a concrete example might help illustrate the reasons for my discomfort with logical reads:

    Consider the cost of a KEY lookup versus an RID lookup.

    In the case of the KEY lookup, if the clustered index happens to be eight levels deep, SQL Server will incur eight logical I/Os to find the row.

    On a heap, the RID lookup incurs a single logical I/O.

    Does that mean that lookups on a heap are eight times more efficient than on a table with a clustered index? In some sense, yes, but in terms of practical application, this is clearly a nonsensical result.
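    For anyone who wants to see the mechanics for themselves, here is a rough sketch (all object names below are made up purely for illustration; on a table this small the clustered index will only be two or three levels deep, so the gap in logical reads is far smaller than the eight-level worst case above, but the asymmetry is the same):

    -- Hypothetical illustration: compare the logical reads reported for a bookmark
    -- lookup against a heap (RID lookup) versus a clustered table (KEY lookup).
    SET NOCOUNT ON;

    IF OBJECT_ID('dbo.LookupHeap','U')      IS NOT NULL DROP TABLE dbo.LookupHeap;
    IF OBJECT_ID('dbo.LookupClustered','U') IS NOT NULL DROP TABLE dbo.LookupClustered;

    CREATE TABLE dbo.LookupHeap      (id INT NOT NULL, id2 INT NOT NULL, payload CHAR(200) NOT NULL);
    CREATE TABLE dbo.LookupClustered (id INT NOT NULL, id2 INT NOT NULL, payload CHAR(200) NOT NULL,
            CONSTRAINT PK_LookupClustered PRIMARY KEY CLUSTERED (id));

    -- The same nonclustered index on both tables; a seek on id2 must then fetch the
    -- payload column: by RID on the heap, by clustering key on the clustered table.
    CREATE NONCLUSTERED INDEX IX_LookupHeap_id2      ON dbo.LookupHeap      (id2);
    CREATE NONCLUSTERED INDEX IX_LookupClustered_id2 ON dbo.LookupClustered (id2);

    INSERT dbo.LookupHeap (id, id2, payload)
    SELECT TOP (100000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
           'x'
      FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;

    INSERT dbo.LookupClustered (id, id2, payload)
    SELECT id, id2, payload FROM dbo.LookupHeap;

    -- How deep is the clustered index really? index_depth counts root to leaf.
    SELECT index_depth
      FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.LookupClustered'), 1, NULL, 'LIMITED');

    SET STATISTICS IO ON;
    SELECT payload FROM dbo.LookupHeap      WHERE id2 = 12345;   -- index seek + RID lookup
    SELECT payload FROM dbo.LookupClustered WHERE id2 = 12345;   -- index seek + KEY lookup
    SET STATISTICS IO OFF;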

    Paul

  • Paul White (3/1/2010)


    I seem to have struck a nerve, though that was not my intention. I simply found your posts a little verbose and hard to understand. The parts that are specific are not written in language that is accessible to me. I have a feeling that much of your terminology has special meaning, but it is lost on me, I'm afraid.

    Fair enough. Although I find it very frustrating, I have been told this enough times in my life, by enough smart people whom I really respect (and that includes you, Paul), that I accept it as true and as a persistent problem that I have in communication. I promise to try to correct this and present my points in a more comprehensible form in a follow-up post shortly. My apologies for that.

    My point is that I provided a link to the information I found to be persuasive. It is there for anyone to read and come to their own conclusions. I am not Joe Chang and so don't intend to defend his work in detail, but I will do my best to respond to points made, if I feel I have something to add.

    I suppose that we must see this sort of thing differently. To me, it's all in the nuance of how we present our "Expert Opinions and Recommendations". Humble or not, I do think that many of us on these forums have to accept that that is what our utterances are when we present them definitively and unqualified, and that they carry a certain responsibility as well. The way I try to mediate this is by distinguishing between when I am conveying a message from some other authority that I myself may not be qualified or inclined to defend ("Microsoft says that setting MAXDOP=1 is a Best Practice", or "I've heard people whom I consider experts say that this is how you should deal with a corrupt database..."), and recommendations that are either solely mine, or that I have adopted from others but understand and agree with well enough to defend as my own position rather than deferring to a higher authority. That's how I approach it, anyway; not everyone sees it the same way.

    I have not yet fully read your latest contribution, but it seems to say that we need to consider both worker time (CPU) and logical reads. That does not seem very far from my comment:

    Ah, I hadn't realized that "worker time" meant CPU. That does help.

    I tend to give much more weight to properly-measured worker time than to reads. That is not to say that I ignore the number of reads: at the very least, a higher number of logical reads implies a possibly less efficient plan, and that warrants a deeper look. Of course, a higher number of reads necessarily involves higher worker time too. My approach is to give a greater weight to CPU usage, but with an eye on reads as well. Naturally, elapsed time is an important consideration too.

    I think that was just about the only part of my post that you didn't take exception to!

    Yes, I do not disagree with this as far as it goes. In fact, I think it is a good illustration of the disconnect between us, and of one of my central concerns in all of this: that in the Microsoft world (in large part due to MS, but also because of certain historical factors), there is a much higher level of fuzziness and ambiguity about the terms, purpose, and nuance of computer performance concepts, and that many or most of the misunderstandings and mis-recommendations come from this. In other words, though in general we mean the same things with these terms, in specific we all mean slightly different things by them, or draw slightly different import from someone else's use of the same terms. This vagueness or imprecision starts out seeming benign and inconsequential, but when it gets applied in chains of reasoning in attempts to draw conclusions, this "vagueness of meaning" is the hole through which the elephant gets pushed, so to speak.

    It's frustrating to me, because of how hard it is to show or explain to others what seems so obvious from this particular technical perspective. Perhaps the best analogy I can draw is with SQL Server and the folks who come here for help with no idea what "relational" or "Normalization" is. Though we may try to explain it to them, and thus why their particular design may be bad or even a nightmare, it is hard as heck to convince them because they have none of the context, experience, or understanding that we do. To them, it seems like we just want to cram our way of doing things down their throats merely because it is part of our belief system. The points we make seem arcane and unfounded ("What's wrong with putting all of the yes/no answers in a single string field separated by commas? It's easier, and why should I care about all of this 'relational' mumbo-jumbo?"), and our claims of simplicity and ease of maintenance seem frankly insane to them. Usually we just give up and say things like "trust us, we know what we're doing", or "look, this is how you should do it, just ask any other expert in our field", or "Codd/Date/Microsoft says so", because ultimately, trying to explain in a forum what Relational Theory is, why it "proves" that they are wrong, and why it is not "just a theory" or "just another viewpoint", is an almost Sisyphean task.

    Well, Performance Theory is like that (except that it doesn't even have a consistent name that I can point to). Nonetheless I will try to do that both intelligibly and in as condensed a form as I can.

    On a separate point, am I to take it that you accept my points about the test script?

    I'm not sure if you mean Joe's tests, or my script? If you mean Joe's tests, then I have no disagreement with them, only with how they are interpreted and applied. If you mean my script and the whole CHECKPOINT-before-DBCC DROPCLEANBUFFERS business, I haven't really thought about it much, but I am willing to agree. My concerns are that it reliably demonstrates what it is intended to demonstrate, but at the same time doesn't do anything terrible to a production system if someone tries it there. I suppose you are right, CHECKPOINT is no worse than DROPCLEANBUFFERS already is, but that just takes me back to whether I should have DROPCLEANBUFFERS in it at all... hhrrrrrm...
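    (For reference, the cold-cache test-rig pattern under discussion is simply the following two statements; it should only ever be run on a test server:)

    -- Cold-cache test rig only: never run this against a production instance.
    CHECKPOINT;              -- write dirty pages to disk so that they become "clean"
    DBCC DROPCLEANBUFFERS;   -- then throw every clean page out of the buffer pool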

    Finally, have you now read the information in the first link I posted? You said you had not, but that may have changed.

    Yes, my reply to it is above.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Ah, good, right. I see. Cool.

    Yes worker time is CPU time spent by a SQL Server worker thread on a particular task, in the RUNNING state (as opposed to RUNNABLE or SUSPENDED, for example). I have rather picked up the terminology used in the various parts of Books Online that refer to schedulers, workers, and such.
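    For what it's worth, here is a minimal sketch of where those worker-time numbers can be pulled from for cached plans (the TOP (20) and the ordering are only illustrative choices; sys.dm_exec_query_stats reports the time columns in microseconds):

    -- Worker time (CPU) and logical reads per cached statement, most CPU-hungry first.
    SELECT TOP (20)
           qs.total_worker_time   / qs.execution_count AS avg_worker_time_us,
           qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
           qs.total_elapsed_time  / qs.execution_count AS avg_elapsed_time_us,
           qs.execution_count,
           SUBSTRING(st.text,
                     qs.statement_start_offset / 2 + 1,
                     (CASE WHEN qs.statement_end_offset = -1
                           THEN DATALENGTH(st.text)
                           ELSE qs.statement_end_offset
                      END - qs.statement_start_offset) / 2 + 1) AS statement_text
      FROM sys.dm_exec_query_stats AS qs
     CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
     ORDER BY qs.total_worker_time DESC;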

    I am pleased that there is a good deal of common ground between us; I look forward to your next post. 🙂

    Paul

    I just stumbled across that Simple Talk article.

    I found it very interesting and descriptive. To me the most important message is to not only rely on the results of statistics IO but to analyze the execution plan as well.

    I just want to forward this message to the OP... 😉



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • lmu92 (3/1/2010)


    ...to not only rely on the results of statistics IO but to analyze the execution plan as well.

    I think we can all agree on that one. In fact, I start by looking at the plan. It's like looking at The Matrix 😀

  • lmu92 (3/1/2010)


    I just stumbled across that Simple Talk article.

    I found it very interesting and descriptive. To me the most important message is to not only rely on the results of statistics IO but to analyze the execution plan as well.

    I just want to forward this message to the OP... 😉

    I know you know this, Lutz... just saying this in case anyone thinks otherwise....

    Don't rely on the execution plan for differences in performance either. Sure, it gives you a lot of good hints as to where a performance problem can be, but it's pretty simple to make code where one half of the code says it'll take 100% even after it runs, yet is still the clear and absolute winner according to Profiler and an old stop watch.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/1/2010)


    ...

    I know you know this, Lutz... just saying this in case anyone thinks otherwise....

    Don't rely on the execution plan for differences in performance either. Sure, it gives you a lot of good hints as to where a performance problem can be, but it's pretty simple to make code where one half of the code says it'll take 100% even after it runs, yet is still the clear and absolute winner according to Profiler and an old stop watch.

    I forgot to mention it. It has just been proven on another thread comparing different methods of querying XML data (Paul and Barry are involved, too... ;-))

    (here's the link, if someone's interested...)

    When "include actual execution plan" is activated, even STATISTICS TIME or PROFILER will get wrong measures on small sample size (I guess it's due to the time required to transform the query plan into the graphic display we got used to. Not sure though...).

    So I've learned to use STATISTICS IO to get a "rough idea" of how the queries run in terms of scan count and logical/physical reads, the actual execution plan to see how the query is actually "translated/transformed" (e.g. for index tuning / query rewrite), and Profiler or the STATISTICS TIME setting (WITHOUT capturing actual execution plan data at the same time!) to get the actual run time (knowing that there are a lot of things that will influence those numbers as well).
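    As a rough sketch of that three-pass separation (the COUNT(*) query against the catalog views below is just a stand-in for whatever query is actually being tuned):

    -- Pass 1: I/O profile only (the "rough idea" of scans and reads).
    SET STATISTICS IO ON;
    SELECT COUNT(*) FROM sys.all_columns AS a JOIN sys.all_objects AS o ON o.object_id = a.object_id;
    SET STATISTICS IO OFF;

    -- Pass 2: run the same query with "Include Actual Execution Plan" enabled in SSMS
    -- to study how it is translated/transformed, but ignore the timings from this pass.

    -- Pass 3: timings only, with the actual-plan capture switched OFF again.
    SET STATISTICS TIME ON;
    SELECT COUNT(*) FROM sys.all_columns AS a JOIN sys.all_objects AS o ON o.object_id = a.object_id;
    SET STATISTICS TIME OFF;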

    All of that leads to a single conclusion (at least for me): the OP's original request cannot be answered as simply as she/he would like, since the subject is far too complex to be covered within a single post. 😉



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • Well, I've been chewing over exactly how to do what I promised to do (explain my position with regard to Logical IOs in clearer terms) in a constructive way that doesn't turn into something the size of a novel, but it's not easy. So, for a couple of different reasons, I have decided to go a different way on this.

    First, I am convinced that this is just too big to fit into a forum post. Secondly, I realize that we have sort of hijacked the OP's thread here, and they may want some relief from us. Thirdly, I want to take this out of the context of proving that anybody is wrong. I just don't feel that that is the right thing for all concerned, and it's the wrong focus for this subject.

    Finally, Mike Walsh announced the subject today for the fourth "T-SQL Tuesday" blog contest, and it was, that's right, "IO". Well, I can't ignore a sign like that, especially since I have really let my blog go since I got sick.

    So what I have decided to do is to kill two birds with one stone and to make my promised reply and explanation of why Logical IO is so important/good for performance evaluation, by doing it as a Blog article. The rules of T-SQL Tuesday give me until next Monday to post it, so that will also give me more time to do this correctly. Once I post it, I will reply back here with a link to it, and we can pick up the conversation from there.

    I do hope that this is satisfactory to everyone (who may still care ... :-)). If not, please let me know and I will try to work out something.

    Thanks,

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (3/2/2010)


    ...

    First, I am convinced that this is just too big to fit into a forum post. Secondly, I realize that we have sort of hijacked the OP's thread here, and they may want some relief from us. Thirdly, I want to take this out of the context of proving that anybody is wrong. I just don't feel that that is the right thing for all concerned, and it's the wrong focus for this subject.

    ...

    I do hope that this is satisfactory to everyone (who may still care ... :-)). If not, please let me know and I will try to work out something.

    Thanks,

    I second your first two arguments. Regarding the third: I don't think you need to prove anybody wrong here, since I don't think there's anybody around insisting on being right. 😉

    Your decision to cover it with a blog post is more than fine with me 😀



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • Seems like a good plan to me. It kinda bugs me to start blogging too...maybe.

  • lmu92 (3/1/2010)


    Jeff Moden (3/1/2010)


    ...

    I know you know this, Lutz... just saying this in case anyone thinks otherwise....

    Don't rely on the execution plan for differences in performance either. Sure, it gives you a lot of good hints as to where a performance problem can be, but it's pretty simple to make code where one half of the code says it'll take 100% even after it runs, yet is still the clear and absolute winner according to Profiler and an old stop watch.

    I forgot to mention it. It just has been proved on another thread when comparing different methods to query XML data (Paul and Barry are involved, too... ;-))

    (here's the link, if someone's interested...

    When "include actual execution plan" is activated, even STATISTICS TIME or PROFILER will get wrong measures on small sample size (I guess it's due to the time required to transform the query plan into the graphic display we got used to. Not sure though...).

    So I've learned to use STATISTICS IO to get a "rough idea" of how the queries run in terms of scan count and logical/physical reads, the actual execution plan to see how the query is actually "translated/transformed" (e.g. for index tuning / query rewrite), and Profiler or the STATISTICS TIME setting (WITHOUT capturing actual execution plan data at the same time!) to get the actual run time (knowing that there are a lot of things that will influence those numbers as well).

    All of that leads to a single conclusion (at least for me): the OP's original request cannot be answered as simply as she/he would like, since the subject is far too complex to be covered within a single post. 😉

    It's not just small sample size that can mess with the execution plan... RBAR will too, especially if the RBAR is in the form of recursion... turn on the actual execution plan and then run this...

    /****************************************************************************************
     Purpose:
     This code demonstrates that the estimated and actual execution plans in SQL Server can
     be very INCORRECT and that the execution plan should only be relied on to provide hints
     as to what may be wrong with a query rather than an absolute indication. This code runs
     in SQL Server 2005 only.

     The code creates a temp table for 10 years worth of dates starting with 2000-01-01 using
     two different methods. The first method uses a recursive CTE and the second method uses
     an on-the-fly creation of a "Tally" table. The output of each method is directed to a
     temp table to take display delays out of the picture.

     --Jeff Moden (28 Sep 2009)
    ****************************************************************************************/
    --=======================================================================================
    --      Setup
    --=======================================================================================
    --===== Suppress the auto-display of rowcounts for performance, appearance, and to keep
         -- from giving GUI code false error indications.
    SET NOCOUNT ON;

    --===== Conditionally drop the temp tables that will be formed by this exercise.
    IF OBJECT_ID('TempDB..#RecursiveResult','U') IS NOT NULL
        DROP TABLE #RecursiveResult;
    IF OBJECT_ID('TempDB..#TallyResult','U') IS NOT NULL
        DROP TABLE #TallyResult;
    GO

    PRINT '========== Recursive method ==========';
    --=======================================================================================
    --      Recursive method provides the necessary row count. Although it seems fast and
    --      shows up in the execution plan as taking 0% of the batch, it's terribly slow
    --      compared to the "Tally" method coming up...
    --=======================================================================================
    --===== Turn on some performance counters ===============================================
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    --===== Execute the code being tested ===================================================
    DECLARE @DateVal DATETIME;
        SET @DateVal = '2000-01-01';

    WITH
    MyCTE AS
    (
     SELECT @DateVal AS DateVal
      UNION ALL
     SELECT DateVal + 1
       FROM MyCTE
      WHERE DateVal + 1 < DATEADD(yy, 10, @DateVal)
    )
    SELECT dates.DateVal
      INTO #RecursiveResult
      FROM MyCTE dates
    OPTION (MAXRECURSION 0)
    ;

    --===== Turn off the performance counters and print a separator =========================
    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;
    PRINT REPLICATE('=',80);
    GO

    PRINT '========== Tally table method ==========';
    --=======================================================================================
    --      Tally table method by Jeff Moden
    --=======================================================================================
    --===== Turn on some performance counters ===============================================
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    --===== Execute the code being tested ===================================================
    DECLARE @StartDate AS DATETIME;
        SET @StartDate = '2000-01-01';

    SELECT TOP (DATEDIFF(dd,@StartDate,DATEADD(yy,10,@StartDate)))
           @StartDate + ROW_NUMBER() OVER (ORDER BY ac1.Object_ID) - 1 AS DateVal
      INTO #TallyResult
      FROM Master.sys.All_Columns ac1
     CROSS JOIN --Intentional constrained cross-join provides the row source
           Master.sys.All_Columns ac2
    ;

    --===== Turn off the performance counters and print a separator =========================
    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;
    PRINT REPLICATE('=',80);

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden


    Don't rely on the execution plan for differences in performance either.

    A skilled and experienced person can tell more about performance from an 'actual' execution plan than Profiler will ever tell you. Sure, estimated plans are often wrong, but that is missing the point. The estimated plan reveals a lot about the optimization process. Comparing an estimated plan to an actual plan can give deep insight into how to write a better query, or which statistics are missing or out of date.
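    For anyone following along, here is a minimal sketch of capturing both flavours of plan from T-SQL rather than the SSMS toolbar (the catalog-view query is only a placeholder); comparing the estimated row counts and executions with the actual ones is where the insight comes from:

    -- Estimated plan only: the statement is compiled but NOT executed.
    SET SHOWPLAN_XML ON;
    GO
    SELECT COUNT(*) FROM sys.all_columns AS a JOIN sys.all_objects AS o ON o.object_id = a.object_id;
    GO
    SET SHOWPLAN_XML OFF;
    GO

    -- Actual plan: the statement runs, and the plan XML includes actual row counts
    -- and actual executions per operator alongside the optimizer's estimates.
    SET STATISTICS XML ON;
    GO
    SELECT COUNT(*) FROM sys.all_columns AS a JOIN sys.all_objects AS o ON o.object_id = a.object_id;
    GO
    SET STATISTICS XML OFF;
    GO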

    Jeff Moden


    It's not just small sample size that can mess with the execution plan... RBAR will too, especially if the RBAR is in the form of recursion... turn on the actual execution plan and then run this...

    This is a great example. I can't help but mention the difference in CPU time for those two executions by the way.

    Again, though, a skilled and experienced DBA wouldn't need to run that repro to predict that the second one would be much faster. Table and index spools, recursion, and a poor estimate of the number of executions required all make it obvious.

    Consider the plan results for the recursive example, and look at the 'estimated executions' versus 'actual executions' for the lowest level of the plan (the branch with the filter). Estimated executions = 2, actual executions = 3652. That tells us why it is slower, and also tells us that the optimizer was unable to make a reasonable prediction for the number of recursions that would be required.

    You can't get that from logical reads. Consider this: the logical reads for the recursive CTE total 21,919. The reads for the tally method total 53, a ratio of 413:1. CPU totals are 62ms and 16ms respectively, a ratio of 4:1. Elapsed times are 73ms and 16ms, a ratio of 4.5:1. Tell me, anyone: which is the better guide?

    Paul

