November 20, 2008 at 9:49 pm
Hi,
Need a little help here. I have a database that is provided by our data vendor ( that is, I don't control the schema ) and was given a sample query which I am trying to optimise ( Yes, I know is a long query ... )
SELECT T.Code, T.Measure, T.Currency_ as ParentFlag, T.Broker, T.Analyst, T.PerDate, T.EstDate, T.Value_ as Estimate,
T.RevDate, (CASE WHEN EXC.EXCLUDEDATE <= '20081121' THEN 1 ELSE 0 end ) as ExcludeFlag
FROM IBGDESTL1 T
LEFT JOIN qai.dbo.IBGDEXCL1 EXC
ON T.CODE = EXC.CODE
AND T.MEASURE = EXC.MEASURE
AND T.BROKER = EXC.BROKER
AND T.ANALYST = EXC.ANALYST
AND T.PERTYPE = EXC.PERTYPE
AND T.PERDATE = EXC.PERDATE
AND T.ESTDATE = EXC.ESTDATE
LEFT JOIN IBGDSTPL1 S
ON T.CODE = S.CODE
AND T.MEASURE = S.MEASURE
AND T.BROKER = S.BROKER
AND T.PERTYPE = S.PERTYPE
AND T.PERDATE = S.PERDATE
AND S.STOPDATE = (SELECT MAX(STOPDATE) FROM IBGDSTPL1
WHERE CODE = S.CODE AND S.MEASURE = MEASURE
AND BROKER = S.BROKER
AND PERTYPE = S.PERTYPE
AND PERDATE = S.PERDATE
AND STOPDATE <= '20081121' )
LEFT JOIN IBGDACTL1 ACT
ON ACT.CODE = T.CODE
AND ACT.MEASURE = T.MEASURE
AND ACT.PERTYPE = 'A'
AND ACT.PERDATE = T.PERDATE
WHERE T.Code in ( 95256, 29, 39)
AND T.measure in ( 8, 9)
AND T.EstDate = ( SELECT MAX(R.EstDate)
FROM IBGDESTL1 R
WHERE T.PERTYPE = 1
AND R.Code = T.Code
AND R.Measure= T.Measure
AND R.Broker = T.Broker
AND R.PerType = T.PerType
AND R.PerDate = T.PerDate
AND R.EstDate <= '20081121')
AND (T.REVDATE > S.STOPDATE
OR S.STOPDATE IS NULL
OR S.STOPDATE > '20081121' )
AND (ACT.RPTDATE >= '20081121'
OR ( ACT.RPTDATE IS NULL
AND T.PerDate >= '20081121'))
The IO Statistics is ...
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'IbgdStpL3'. Scan count 1, logical reads 14604, physical reads 2, read-ahead reads 14486, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'IbgdExcL3'. Scan count 1, logical reads 34446, physical reads 1, read-ahead reads 673, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'IbgdEstL3'. Scan count 76, logical reads 289036, physical reads 835, read-ahead reads 6668, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'IbgdActL3'. Scan count 1, logical reads 44744, physical reads 2, read-ahead reads 41531, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
But ... what I want to do is to reduce the number of rows returned, with 2 additional requirements ...
AND T.PERTYPE = 1
AND T.PERDATE = '20090630'
The end results from adding the 2 lines actually made the query run 5 times slower. Here is the IO Statistics ...
Table 'IbgdEstL3'. Scan count 76, logical reads 43990, physical reads 8, read-ahead reads 6318, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'IbgdStpL3'. Scan count 3, logical reads 16053, physical reads 54, read-ahead reads 14434, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'IbgdActL3'. Scan count 3, logical reads 49458, physical reads 87, read-ahead reads 44650, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 579, logical reads 36416, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'IbgdExcL3'. Scan count 3, logical reads 38163, physical reads 253, read-ahead reads 34161, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
The thing is, I am not doing a table scan, the clustered index on the main table 'IbgdEstL3' has the following key column ( in order )...
Code, Measure, Broker, PerType, PerDate, EstDate, Analyst
Normally, I would assume that by restriction the data returned using columns that are part of the primary key would make it quicker, but why am I seeing this running 4-5 times slower ?? Any ideas how I can speed this up ? I am running this on SQL Server 2005.
Thanks
E
November 21, 2008 at 8:03 am
eric.lam (11/20/2008)
But ... what I want to do is to reduce the number of rows returned, with 2 additional requirements ...
AND T.PERTYPE = 1
AND T.PERDATE = '20090630'
The thing is, I am not doing a table scan, the clustered index on the main table 'IbgdEstL3' has the following key column ( in order )...
Code, Measure, Broker, PerType, PerDate, EstDate, Analyst
Looking at a slightly reformatted version of the query, to make it easier to read:
SELECT T.Code, T.Measure, T.Currency_ as ParentFlag, T.Broker, T.Analyst, T.PerDate, T.EstDate, T.Value_ as Estimate,
T.RevDate, (CASE WHEN EXC.EXCLUDEDATE <= '20081121' THEN 1 ELSE 0 end ) as ExcludeFlag
FROM IBGDESTL1 T
LEFT JOIN qai.dbo.IBGDEXCL1 EXC
ON T.CODE = EXC.CODE
AND T.MEASURE = EXC.MEASURE
AND T.BROKER = EXC.BROKER
AND T.ANALYST = EXC.ANALYST
AND T.PERTYPE = EXC.PERTYPE
AND T.PERDATE = EXC.PERDATE
AND T.ESTDATE = EXC.ESTDATE
LEFT JOIN IBGDSTPL1 S
ON T.CODE = S.CODE
AND T.MEASURE = S.MEASURE
AND T.BROKER = S.BROKER
AND T.PERTYPE = S.PERTYPE
AND T.PERDATE = S.PERDATE
AND S.STOPDATE = (SELECT MAX(STOPDATE) FROM IBGDSTPL1
WHERE CODE = S.CODE AND S.MEASURE = MEASURE
AND BROKER = S.BROKER
AND PERTYPE = S.PERTYPE
AND PERDATE = S.PERDATE
AND STOPDATE <= '20081121' )
LEFT JOIN IBGDACTL1 ACT
ON ACT.CODE = T.CODE
AND ACT.MEASURE = T.MEASURE
AND ACT.PERTYPE = 'A'
AND ACT.PERDATE = T.PERDATE
WHERE T.Code in ( 95256, 29, 39)
AND T.measure in ( 8, 9)
AND T.EstDate = ( SELECT MAX(R.EstDate)
FROM IBGDESTL1 R
WHERE T.PERTYPE = 1
AND R.Code = T.Code
AND R.Measure= T.Measure
AND R.Broker = T.Broker
AND R.PerType = T.PerType
AND R.PerDate = T.PerDate
AND R.EstDate <= '20081121')
AND (T.REVDATE > S.STOPDATE
OR S.STOPDATE IS NULL
OR S.STOPDATE > '20081121' )
AND (ACT.RPTDATE >= '20081121'
OR ( ACT.RPTDATE IS NULL
AND T.PerDate >= '20081121'))
it looks like it won't use the primary key index to resolve your WHERE clause criteria on PerType and PerDate because there is the third column Broker which does not have a WHERE clause criteria on it. Since PerType and PerDate are after Broker in the index's column order, SQL Server has to scan all of the rows that meet your Code and Measure criteria and then check PerType and PerDate criteria separately over those scanned rows.
For multiple column indexes like this, it's typically better to put the columns most frequently used in WHERE clauses first, as long as they (or the combination of the multiple columns) would be fairly selective. So if Broker is seldom used as a WHERE clause criteria then it may be more optimal to have it after PerType and PerDate in the index column order, but this needs to be considered system wide, not just for this one query.
If it's typically used throughout the rest of the system, then maybe a separate index could be created for the minority cases like this that don't use Broker as a WHERE clause criteria.
November 21, 2008 at 9:03 am
You may also want to look at converting this subquery in the where clause:
( SELECT MAX(R.EstDate)
FROM IBGDESTL1 R
WHERE T.PERTYPE = 1
AND R.Code = T.Code
AND R.Measure= T.Measure
AND R.Broker = T.Broker
AND R.PerType = T.PerType
AND R.PerDate = T.PerDate
AND R.EstDate <= '20081121')
To a derived table and joining on it.
Also by referencing right hand tables in your where clause you are essentially creating an INNER JOIN so you may want to re-evaluate have this in your WHERE and move them to the appropriate JOIN clauses:
AND (T.REVDATE > S.STOPDATE
OR S.STOPDATE IS NULL
OR S.STOPDATE > '20081121' )
AND (ACT.RPTDATE >= '20081121'
OR ( ACT.RPTDATE IS NULL
AND T.PerDate >= '20081121'))
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
November 21, 2008 at 12:20 pm
Thanks. Yes, there are other indexes on the table as well, but they all have broker on it. Because this is a vendor provided database, I don't have control over the indexes either.
I ended up selecting everything and then filtering out ( applying the where ) in the code that retrieve the data. I just can't believe that after looking into this for so long, I have to use the stupidest solution in the book.
Viewing 4 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply