findingdifference between full or partial index scan in graphical execution plan?

  • hi all,

    many times when looking to graphical excution plans I see the operator "clustered index scan". The description says "scanning a clustered index, entirely or only a range". But how do I find out whether it is entirely or only a range scan?

    Here's an example.

    -- ==============================================

    -- Step 1: create data table

    -- ==============================================

    set nocount on;

    create table #TestIndexAccess ( col1 int not null primary key clustered ( col1 ) );

    declare @counter int = 0;

    while @counter <= 500000

    begin

    insert into #TestIndexAccess ( col1 ) values ( @counter )

    set @counter += 1;

    end

    -- ==============================================

    -- Step 2: querying and looking at execution plan

    -- ==============================================

    select min( col1 ) from #TestIndexAccess;

    select count( col1 ) from #TestIndexAccess;

    --drop table #TestIndexAccess;

    go

    The execution plans can be found in the attachment (did not know which file format is preferred so I uploaded two formats).

    Both queries containt the clustered index scan, but how to determine the difference?

    Of course there is a much thicker line for the second statement and of course I can have a look at the logical reads but this would not work for large tables.

    Any ideas?

    Thank's a lot, Wolf

  • Since these are two different queries, you have to examine what they're doing in order to understand what the execution plan is representing to you. One is doing a MIN, the other is doing a COUNT. If you look at the properties of the MIN operation, you can see that the scan is an ordered scan. This means it's using the order of the values to do the scan. Which means it hits the first value and stops. That's why the estimate is only a single row (same as the actual). Logically, this answers your question. The other query requires accessing all the data and then performing an aggregation action (stream aggregate in this case) and it's reflected within the properties again, the scan is unordered and it returns everything.

    Neither of these queries involves a predicate or other filtering mechanism, so seeks are not possible. To know what's happening with a scan with a predicate, you compare the predicate to the data. A predicate involving, for example, a function, means that all data in the table will be scanned. A predicate that returns more than a small percentage of rows will result in a scan (possibly) but will start and stop based on the predicate value. The plan itself only tells you how much data it thought it needed to access (estimated rows) and, with an actual plan, how much data it did access (actual rows). It's up to you to interpret the methods of access.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Typically a scan is a full scan (in the absence of TOP or other operators that give row goals). It's the seek that can be a single row or a range (up to the entire table)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hello Grant Fritchey,

    thank's a lot for your very detailled explanation. I'm still not sure if I understand the part I'm interested in most.

    I know these queries had no predicate and I'm aware of the differences of index scans and index seeks. The problem I personally have is that many times I see index scan operations and many times I'm not sure if it scans the whole index or just a part of it.

    If I use the following query:

    select top 1000 * from #TestIndexAccess

    I still have a clustered index scan but it is NOT an ordered scan. Still the first query needs only 3 logical reads. I don't think it did a full scan. Now how do you see the difference of this last query to the others?

    If I have a table with several million of rows and the partial scan returns some 100 thousands of them I dont's see a possibility to determine which kind of index access happend.

    Now, did I not understand all of your information or did I forget something else? 🙂

    Thank's a lot, Wolf

  • GilaMonster (11/26/2012)


    Typically a scan is a full scan ...

    Hi Gila,

    that's the point, it seems that it is NOT ALWAYS a full scan. The example shows it, using the min()-Function does not a full scan. Now, only at looking at the query plan, how do I know what it actually is?

    And also the operator description says it could be not a full scan like seen in the attachment.

  • weberharter (11/26/2012)


    select top 1000 * from #TestIndexAccess

    I still have a clustered index scan but it is NOT an ordered scan. Still the first query needs only 3 logical reads. I don't think it did a full scan.

    No, it didn't do a full scan. The operator's the same, you can tell partially from the number of rows outputted and the reads done. It's still a scan, not a seek

    p.s. It's not an ordered scan because there's no order by on your query, hence no requirement for the data to be in any particular order. Put an Order By <clustered index column> and you'll get an ordered scan.

    Ordered scan = scan the index in order and preserve the order

    Unordered scan = scan the index and don't worry about the order

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (11/26/2012)

    ... you can tell partially from the number of rows outputted...

    Yes, in this case I can. But we have many, many, many tables and I don't know the number of records of any table by heart. So any time I want to know if it's a partial scan I have to determine the full number of rows first?

    The point is, many of our foreign keys reference to columns that are the clustered index so many joins result in clustered index lookups or scans. And that's where I want to know the difference. As the scan is done maybe multiple times I cannot even have a look at the logical reads as they are multiplied.

  • weberharter (11/26/2012)


    GilaMonster (11/26/2012)


    Typically a scan is a full scan ...

    Hi Gila,

    that's the point, it seems that it is NOT ALWAYS a full scan.

    Well if you selectively quote me, yes what you quote will be incorrect.

    What I said was

    "Typically a scan is a full scan (in the absence of TOP or other operators that give row goals)"

    Hence, if you see a scan and there is not something that could stop the scan part way through (Top, Min, Max, Row_Number, a couple of other similar operators) it is a full scan.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • weberharter (11/26/2012)


    The point is, many of our foreign keys reference to columns that are the clustered index so many joins result in clustered index lookups or scans. And that's where I want to know the difference.

    Those will be full scans. Lookups are single row clustered index seeks.

    As I said earlier, there's nothing in the operator itself that will tell you whether it's a full or a partial, but in the absence of TOP or other operators that give row goals, it is a full scan of the entire index.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (11/26/2012)

    What I said was

    "Typically a scan is a full scan (in the absence of TOP or other operators that give row goals)"

    Hence, if you see a scan and there is not something that could stop the scan part way through (Top, Min, Max, Row_Number, a couple of other similar operators) it is a full scan.

    ah, ok 🙂 I thinkt the "in absence of"-part was the information I needed for understanding. So it seems that there is no "reliable" difference in the query plan. But with looking at the operators used in the quey text I should be able to "guess" what it is.

    Thank's a lot for your help, I think I got it now 🙂

  • weberharter (11/26/2012)


    GilaMonster (11/26/2012)

    What I said was

    "Typically a scan is a full scan (in the absence of TOP or other operators that give row goals)"

    Hence, if you see a scan and there is not something that could stop the scan part way through (Top, Min, Max, Row_Number, a couple of other similar operators) it is a full scan.

    ah, ok 🙂 I thinkt the "in absence of"-part was the information I needed for understanding. So it seems that there is no "reliable" difference in the query plan. But with looking at the operators used in the quey text I should be able to "guess" what it is.

    Thank's a lot for your help, I think I got it now 🙂

    It's not guessing. It's a logical deduction based on the information provided. Going back to your examples, you can see how the access is occurring if you look into the properties of the operator. The MIN function is accessing the data, through a SCAN, but in an ordered fashion, to retrieve a single row (a partial access of the SCAN operation). It's not a guess. The information is right there, plain as day. You just have to interpret it. The execution plan won't provide all answers by itself. You have to work with the query, the statistics, the plan, and a knowledge of how SQL Server operates in order to arrive at a good conclusion.

    As Gail says, a scan is a scan. Assume it. Work from there. Yes, not all scans are created equal and not all scans are bad. Assume it's accessing the entire index unless you have indications (from the properties and the query) otherwise.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply