IO Statistics

  • Hi again Gail,

    Sure. Everything you say is true. The only thing I would say (in a small voice) is that my previous comments were 'off the top of my head' - you clearly spent more time on that last one! I do know really that a seek can return 0 or many rows - I was thinking of a key (+ unique-ifier) or RID lookup when I wrote that. The brain was in a different place from the fingers :blush:

    The points I was failing to make clearly are:

    1. What we understand and see in plans as scans and seeks won't necessarily tally exactly with stats IO ideas of scans and seeks.

    2. The range seek in your article is not all that different from a full scan really.

    Point 1

    If you look at the XML show plan for example, you'll see all sorts of index scan, seek, and range references, nested within the same operation. Some 'seeks' are EQ operations, others are GE and LE ranges, and so on.

    Just because that is the logical plan we can see, doesn't mean that the deep dark internals of the engine aren't free to do it slightly differently in practice, or to record different things as 'seeks' or 'scans' if they choose to.

    You mentioned semantics in your last post - I think there is a lot to that - and it is also true that 'scan' in particular seems to have different meanings in different places (e.g the tooltip).

    I also think that the terms in question might mean different things in different places. I'm happier with that answer than thinking that the reported scan count is somehow 'random' or just 'wrong'.

    Point 2

    So a full scan runs through the whole thing, and a range scan seeks to find a start and an end. The middle bit is quite similar though, eh? And sure, sometimes we'll get the thing traversing a linked list, and sometimes it'll scan from the IAM. That complexity doesn't change the fact that the operations are extremely similar aside from the initial seek: It's not as if a range seek starts from the very top of index and seeks down the b-tree to the leaf for every row. For a range-start, range-end operation I mean. Multiple prefix EQ seeks (e.g IN query), or key/RID lookups from a loop join, sure they do traverse the structure that way.

    Finally, yes prefetch is read-ahead and in a separate area. But in principle, those actions are physically a scan, even if the pages fetched that way are then seeked on a row basis, or range-scanned - it doesn't matter that it's in a separate area, pages were scanned to satisfy the query. I'm just sayin'.

    I'm not trying to be difficult at all, just explaining that I'm talking in a looser, wider sense. Hey I hope that makes some sort of sense to you.

    Thanks though for the articulate and accurate post, appreciated 🙂

  • GilaMonster (4/2/2009)


    Grant Fritchey (3/27/2009)


    1. The first query, by a long shot. Although it has a higher number of physical reads. But if both these queries are accessing the same data then it was probably in cache when you ran the second query.

    Why do you say that? It has higher logical IOs (100 000 vs 80 000) as well as the higher physical

    Mainly based on the scan count. I'm still assuming, possibly incorrectly, that the second query was able to read from cache where as the first had to go to disk. I think if the second query had to go to disk as well, it'd be quite a bit longer.

    However, I wasn't aware that the scans were actually a little less of a decent measure than I had originally believed (by the way, do you just memorize Connect bugs or something) so maybe my original estimate is off.

    Generally, I don't count on the logical IOs as a single measure. I combine them with what I see in the execution plan and, possibly incorrectly, the scans, along with the actual execution times. What's not known here is how the queries were tested. Did the OP clear the cache and the buffers prior to each execution? If not, you're not getting a thorough test. At the very least we'd need to run the queries in reverse order and see how that affects the results.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Hey Grant,

    You might well be right - the first query does do more logical IO, but that's not the whole story. If the whole table is scanned in allocation order it might be more efficient in practice - might. The other thing is CPU. 2005 often made a bit of a mess of queries against a single partition - especially if the tables were large enough to warrant an parallel plan (one thread per partition), or if a constant scan with a loop join to a lookup or whatever was executed many times. It's impossible to know for sure which is the 'better query' just from looking at the SQL.

    As an aside, the addiction to measuring logical IO alone that some have is a bit one-dimensional. See Joe Chang's excellent blogs on CPU usage and table scanning at . Makes you think!

    Cheers,

    /Paul

  • Yeah, I've read those. I was never completely dependent on IO, but it sure made me second guess even how much I used it.

    But Gail was write to call me on such an unequivocal statement based on inadequate information. I made some undefined assumptions. I should have asked the OP a couple of more questions first.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (4/2/2009)


    Mainly based on the scan count. I'm still assuming, possibly incorrectly, that the second query was able to read from cache where as the first had to go to disk. I think if the second query had to go to disk as well, it'd be quite a bit longer.

    Well they both had to read partially from disk, both have physical IOs.

    Thing with the logical IOs is (if I'm remembering correctly), they're cumulative over all 'scans'. So a query that does 500 reads in one scan is doing the same number of IOs as a query that does 500 in 100 scans

    (by the way, do you just memorize Connect bugs or something)

    No. This came up in the private newsgroups a couple weeks back and the connect was referenced there.

    Generally, I don't count on the logical IOs as a single measure.

    Likewise, but in this case it's all we have so far.

    Of course, in this case it's a moot point since the two queries that we're comparing aren't equivalent. One just has a join, the other a join and a filter.

    Apples and pears.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply