Clustered Index Scan

  • How to optimise the following query, it's doing a Clustered Index Scan (Clustered).

    select document.DocumentID as 'document.document.id',

    document.DocFolderID as 'document.document.folderID',

    document.FileName as 'document.document.description',

    document.FileSize as 'document.document.fileSize',

    document.ModifiedDate as 'document.document.modifiedDate',

    document.TableName as 'document.document.tableName',

    document.EntityCode as 'document.document.entityCode'

    from Document where (DocFolderID = 1) and (EntityCode is null OR (TableName='Employee' AND EntityCode='xxxxxx')) AND PublishToWeb='1'

    (12 row(s) affected)

    Table 'Document'. Scan count 1, logical reads 2842, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

  • bebe022476 (1/20/2016)


    How to optimise the following query, it's doing a Clustered Index Scan (Clustered).

    select document.DocumentID as 'document.document.id',

    document.DocFolderID as 'document.document.folderID',

    document.FileName as 'document.document.description',

    document.FileSize as 'document.document.fileSize',

    document.ModifiedDate as 'document.document.modifiedDate',

    document.TableName as 'document.document.tableName',

    document.EntityCode as 'document.document.entityCode'

    from Document where (DocFolderID = 1) and (EntityCode is null OR (TableName='Employee' AND EntityCode='xxxxxx')) AND PublishToWeb='1'

    (12 row(s) affected)

    Table 'Document'. Scan count 1, logical reads 2842, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    1) ORs are very difficult for the optimizer to get right. Consider a compound statistic on the columns involved.

    2) Much more importantly, why do you think the optimizer should NOT be doing a table scan? That can very well be the most efficient way to access a given table.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Table definition, index definitions and execution plan please.

    But first, is the query really a problem?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • bebe022476 (1/20/2016)


    How to optimise the following query, it's doing a Clustered Index Scan (Clustered).

    select document.DocumentID as 'document.document.id',

    document.DocFolderID as 'document.document.folderID',

    document.FileName as 'document.document.description',

    document.FileSize as 'document.document.fileSize',

    document.ModifiedDate as 'document.document.modifiedDate',

    document.TableName as 'document.document.tableName',

    document.EntityCode as 'document.document.entityCode'

    from Document where (DocFolderID = 1) and (EntityCode is null OR (TableName='Employee' AND EntityCode='xxxxxx')) AND PublishToWeb='1'

    (12 row(s) affected)

    Table 'Document'. Scan count 1, logical reads 2842, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    What is the problem?

    😎

  • GilaMonster (1/20/2016)


    Table definition, index definitions and execution plan please.

    But first, is the query really a problem?

    Sorry Gail, didn't see your post, makes mine look kind of daft:-P

    😎

  • Here's my standard pattern for optimizing a query that needs OR. It might give you two index seeks instead of an index scan, but only if there are usable indexes. And then it might still not be faster - sometimes a scan is the fastest way to get what you need. And less than 3000 logical reads is usually not a big deal.

    SELECT bla bla

    FROM Document

    WHERE DocFolderID = 1

    AND PublishToWeb='1'

    AND EntityCode IS NULL

    UNION ALL

    SELECT bla bla

    FROM Document

    WHERE DocFolderID = 1

    AND PublishToWeb='1'

    AND TableName='Employee'

    AND EntityCode='xxxxxx'

    -- AND EntityCode IS NOT NULL -- Not needed in this case because of EntityCode = 'xxxxx', but required if conditions are not mutually exclusive

    ;


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • HI Gail,

    Here it is:

    DocumentID(PK,int,not null)

    DocFolderID(int,null)

    PublishToWeb(bit,null)

    Filename(varchar(100),null)

    Filesize(int,null)

    ModifiedDate(datetime,null)

    TableName(varchar(40),null)

    EntityCode(varchar(12),null)

    I'm not sure how to attached the jpg screenshot here but the execution plan only has 100% Clustered Index Scan(Clustered) on [document],[pk_document] .

    The job normally runs for 2 hours and since looking at optimising the job, this is only one of the expensive queries in the job.

    Thanks for your help

  • You are doing < 3000 page reads with this query. You could do that at floppy-disk speed (which many of my clients actually have for their IO performance!!) and STILL finish the query in maybe a few tens of seconds. This job of yours may have lots of problems but I cannot see how this one could be one of them - unless it is being blocked I suppose.

    Use sp_whoisactive to see real-time performance and issues such as long-duration blocking.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • bebe022476 (1/20/2016)


    HI Gail,

    Here it is:

    DocumentID(PK,int,not null)

    DocFolderID(int,null)

    PublishToWeb(bit,null)

    Filename(varchar(100),null)

    Filesize(int,null)

    ModifiedDate(datetime,null)

    TableName(varchar(40),null)

    EntityCode(varchar(12),null)

    I'm not sure how to attached the jpg screenshot here but the execution plan only has 100% Clustered Index Scan(Clustered) on [document],[pk_document] .

    The job normally runs for 2 hours and since looking at optimising the job, this is only one of the expensive queries in the job.

    Thanks for your help

    Don't bother attaching the screen capture. That's a waste of time. Instead, right click on the plan and select "Save As." You can then save the plan as a .sqlplan file. That can be attached and uploaded here. When working with execution plans, the picture is nice, but the interesting pieces of information are in the properties of all the operators. That's why the plan is more important than the picture.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • bebe022476 (1/20/2016)


    HI Gail,

    Here it is:

    DocumentID(PK,int,not null)

    DocFolderID(int,null)

    PublishToWeb(bit,null)

    Filename(varchar(100),null)

    Filesize(int,null)

    ModifiedDate(datetime,null)

    TableName(varchar(40),null)

    EntityCode(varchar(12),null)

    I'm not sure how to attached the jpg screenshot here but the execution plan only has 100% Clustered Index Scan(Clustered) on [document],[pk_document] .

    The job normally runs for 2 hours and since looking at optimising the job, this is only one of the expensive queries in the job.

    Thanks for your help

    Grant already replied with instructions on how to attach the execution plan. If possible, use an actual execution plan as this contains more useful information than an estimated execution plan.

    Have you ever tried just executing this query manually from a query window to see how long it runs? Like Kevin, I have a hard time imagining how a query that process just a few thousand pages would run for hours.

    For optimization purposes, please provide more information about the table - you posted a column list, but we definitely need to know all the indexes that are defined on the table as well. If you have any CHECK-, UNIQUE-, or FOREIGN KEY-constraints, share them too.

    You mention that the Clustered Index Scan is 100% - I assume you are talking about the percentages displayed in the execution plan? Please stop trusting them. They are at best a loose indication based on abstractions and assumptions that are outdated by two decades. Plus, those numbers are based on estimated row counts (even in an actual execution plan), so if the estimates are wrong they mean even less than they already do.

    The same goes for the percentage-cost of this plan in relation to the entire batch, if you looked at that (which I think you did, based on your statement that this is one of the expensive queries). This is also based on estimated costs, so goes back to the same outdated costing model and the same caveats about incorrect estimated row counts.

    The best way to really find out which query or queries in a batch or job are causing slowness are to have an extended events session capture relevant events while the job runs. Second best is to manually execute the statements one by one and track how long each one takes, but that is not always possible (and sometimes differences between the settings of SSMS vs SQL Agent or differences in concurrent workload while the queries are running make the results of such a test completely different from the actual run.

    Yet another way could be to create a logging table and add INSERT statements that add a row in that table to track progress - make sure to include a datestamp column so that you can later verify how long each step took. This will not work if the job fails and has to roll back, because then the inserts into the logging table will also be rolled back.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply