January 20, 2016 at 2:30 pm
How do I optimise the following query? It's doing a Clustered Index Scan (Clustered).
select document.DocumentID as 'document.document.id',
document.DocFolderID as 'document.document.folderID',
document.FileName as 'document.document.description',
document.FileSize as 'document.document.fileSize',
document.ModifiedDate as 'document.document.modifiedDate',
document.TableName as 'document.document.tableName',
document.EntityCode as 'document.document.entityCode'
from Document
where DocFolderID = 1
  and (EntityCode is null or (TableName = 'Employee' and EntityCode = 'xxxxxx'))
  and PublishToWeb = '1'
(12 row(s) affected)
Table 'Document'. Scan count 1, logical reads 2842, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
January 20, 2016 at 2:41 pm
bebe022476 (1/20/2016)
How to optimise the following query, it's doing a Clustered Index Scan (Clustered). [...]
1) ORs are very difficult for the optimizer to get right. Consider a compound statistic on the columns involved.
2) Much more importantly, why do you think the optimizer should NOT be doing a table scan? That can very well be the most efficient way to access a given table.
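A multi-column statistic on the predicate columns might look like the following (column names are taken from the posted query; the dbo schema is an assumption):

```sql
-- Multi-column statistic over the WHERE-clause columns, so the optimizer
-- can estimate the combined selectivity of the OR'd predicates.
CREATE STATISTICS st_Document_FolderWebEntity
    ON dbo.Document (DocFolderID, PublishToWeb, TableName, EntityCode);
```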
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 20, 2016 at 2:44 pm
Table definition, index definitions and execution plan please.
But first, is the query really a problem?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 20, 2016 at 2:49 pm
bebe022476 (1/20/2016)
How to optimise the following query, it's doing a Clustered Index Scan (Clustered). [...]
What is the problem?
😎
January 20, 2016 at 2:50 pm
GilaMonster (1/20/2016)
Table definition, index definitions and execution plan please.But first, is the query really a problem?
Sorry Gail, didn't see your post - makes mine look kind of daft :-P
😎
January 20, 2016 at 3:03 pm
Here's my standard pattern for optimizing a query that needs OR. It might give you two index seeks instead of an index scan, but only if there are usable indexes. And then it might still not be faster - sometimes a scan is the fastest way to get what you need. And less than 3000 logical reads is usually not a big deal.
SELECT bla bla
FROM Document
WHERE DocFolderID = 1
AND PublishToWeb='1'
AND EntityCode IS NULL
UNION ALL
SELECT bla bla
FROM Document
WHERE DocFolderID = 1
AND PublishToWeb='1'
AND TableName='Employee'
AND EntityCode='xxxxxx'
-- AND EntityCode IS NOT NULL -- Not needed in this case because EntityCode = 'xxxxxx' already excludes NULL, but required if the conditions are not mutually exclusive
;
January 20, 2016 at 5:10 pm
HI Gail,
Here it is:
DocumentID(PK,int,not null)
DocFolderID(int,null)
PublishToWeb(bit,null)
Filename(varchar(100),null)
Filesize(int,null)
ModifiedDate(datetime,null)
TableName(varchar(40),null)
EntityCode(varchar(12),null)
I'm not sure how to attach the jpg screenshot here, but the execution plan only has a 100% Clustered Index Scan (Clustered) on [Document].[PK_Document].
The job normally runs for 2 hours; while looking at optimising it, I found that this is only one of the expensive queries in the job.
Thanks for your help
January 20, 2016 at 9:38 pm
You are doing < 3000 page reads with this query. You could do that at floppy-disk speed (which many of my clients actually have for their IO performance!!) and STILL finish the query in maybe a few tens of seconds. This job of yours may have lots of problems but I cannot see how this one could be one of them - unless it is being blocked I suppose.
Use sp_whoisactive to see real-time performance and issues such as long-duration blocking.
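A minimal call, assuming you have installed Adam Machanic's procedure (the parameters below are documented sp_whoisactive options):

```sql
-- Show active requests; @get_plans pulls each request's query plan and
-- @find_block_leaders flags the sessions at the head of a blocking chain.
EXEC sp_whoisactive
    @get_plans = 1,
    @find_block_leaders = 1;
```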
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 21, 2016 at 2:22 am
bebe022476 (1/20/2016)
HI Gail, Here it is: [...]
Don't bother attaching the screen capture. That's a waste of time. Instead, right click on the plan and select "Save As." You can then save the plan as a .sqlplan file. That can be attached and uploaded here. When working with execution plans, the picture is nice, but the interesting pieces of information are in the properties of all the operators. That's why the plan is more important than the picture.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 21, 2016 at 2:58 am
bebe022476 (1/20/2016)
HI Gail, Here it is: [...]
Grant already replied with instructions on how to attach the execution plan. If possible, use an actual execution plan as this contains more useful information than an estimated execution plan.
Have you ever tried just executing this query manually from a query window to see how long it runs? Like Kevin, I have a hard time imagining how a query that processes just a few thousand pages could run for hours.
For optimization purposes, please provide more information about the table - you posted a column list, but we definitely need to know all the indexes defined on the table as well. If you have any CHECK, UNIQUE, or FOREIGN KEY constraints, share them too.
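To illustrate the kind of index that could turn the scan into seeks - this is only a sketch, since we have not seen the existing indexes or the data distribution:

```sql
-- Speculative covering index for the posted query. DocumentID (the clustered
-- key) is carried in the nonclustered index automatically, so only the
-- remaining SELECT-list columns need to be INCLUDEd.
CREATE NONCLUSTERED INDEX IX_Document_DocFolderID_PublishToWeb
    ON dbo.Document (DocFolderID, PublishToWeb, EntityCode, TableName)
    INCLUDE (FileName, FileSize, ModifiedDate);
```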
You mention that the Clustered Index Scan is 100% - I assume you are talking about the percentages displayed in the execution plan? Please stop trusting them. They are at best a loose indication based on abstractions and assumptions that are outdated by two decades. Plus, those numbers are based on estimated row counts (even in an actual execution plan), so if the estimates are wrong they mean even less than they already do.
The same goes for the percentage cost of this plan relative to the entire batch, if you looked at that (which I think you did, based on your statement that this is one of the expensive queries). That, too, is based on estimated costs, so it goes back to the same outdated costing model and the same caveats about incorrect estimated row counts.
The best way to really find out which query or queries in a batch or job are causing slowness is to have an Extended Events session capture relevant events while the job runs. Second best is to manually execute the statements one by one and track how long each takes, but that is not always possible (and sometimes differences between the settings of SSMS vs SQL Agent, or differences in concurrent workload while the queries run, make the results of such a test completely different from the actual run).
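A bare-bones session for that could look like this (the session_id filter is a placeholder - replace it with the id of the job's session, or filter on database or login instead):

```sql
-- Capture completed statements to a file target; afterwards, read the file
-- with sys.fn_xe_file_target_read_file to see per-statement durations.
CREATE EVENT SESSION TrackJobQueries ON SERVER
ADD EVENT sqlserver.sql_statement_completed
    (ACTION (sqlserver.sql_text)
     WHERE (sqlserver.session_id = 55))   -- placeholder session_id
ADD TARGET package0.event_file (SET filename = N'TrackJobQueries');
GO
ALTER EVENT SESSION TrackJobQueries ON SERVER STATE = START;
```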
Yet another way is to create a logging table and add INSERT statements that write a row to it after each step - make sure to include a datetime column so that you can later see how long each step took. This will not work if the job fails and rolls back, because the inserts into the logging table will be rolled back too.
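The pattern is simple enough to sketch (the table and step names below are made up):

```sql
-- One row per completed step; the default captures when the step finished.
CREATE TABLE dbo.JobProgress
    (StepName varchar(100) NOT NULL,
     LoggedAt datetime2 NOT NULL DEFAULT SYSDATETIME());

-- After each step in the job:
INSERT dbo.JobProgress (StepName) VALUES ('Step 1 finished');
```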