June 29, 2011 at 8:38 am
First off, this query was delivered to my company many years ago when they bought their original BI solution. I am in the process of trying multiple things to make this work faster. The original ETL tool was cognos decision stream 7.1 and I am now migrating it to SSIS 2k8r2 64 bit. I have tried to 'break apart' the query to make it run faster, however my brain just isn't figuring it out. With all the possibilities due to the multiple 'OR' statements in the where clause. This is pulling from Mapics(iSeries) which is a DB2 system. Any help making this faster would be appreciated. I am working diligently on this however I have hit a wall.
SELECT
A.CRDT,
A.STID,
A.ITNBR,
A.ITRV,
A.CFST,
A.INVFG,
A.ITDSC,
A.ENGNO,
A.UNMSR,
A.ITTYP,
A.ITCLS,
A.VNDNR,
A.WEGHT,
A.PTAXI,
A.STAXI,
A.UUSA,
A.UUSB,
A.UUSC,
A.UUCA,
A.UUCB,
A.UUCC,
A.UUQ1,
A.UUA1,
A.UUD1,
A.UU25,
A.UU40,
A.UVHC,
A.UVMC,
A.SHAP,
A.HITE,
A.LONG,
A.IDIA,
A.ODIA,
A.DUOM,
A.WIDE,
B.CHDE,
B.LOTSZ,
B.RCFLG,
B.STDUC,
B.CURUC,
B.CURSU,
B.CTECH,
B.SLTAB,
B.CLTAB,
B.LABHR,
B.CSTAC,
B.CSTAS,
B.UUSA,
B.UUSB,
B.UUSC,
B.UUCA,
B.UUCB,
B.UUCC,
B.UUQ1,
B.UUA1,
B.UUD1,
B.UU40,
B.CMAT,
B.COOT,
B.CPOT,
B.CSLT,
B.CRLT,
B.CSMT,
B.CRMT,
B.COHT,
B.CO1T,
B.CO2T,
B.CO3T,
B.CO4T,
B.CMAL,
B.COOL,
B.CPOL,
B.CSLL,
B.CRLL,
B.CSML,
B.CRML,
B.COHL,
B.CO1L,
B.CO2L,
B.CO3L,
B.CO4L,
B.CRLC,
B.CPTAB,
B.COHC,
B.SMAT,
B.SOOT,
B.SPOT,
B.SSLT,
B.SRLT,
B.SSMT,
B.SRMT,
B.SOHT,
B.SO1T,
B.SO2T,
B.SO3T,
B.SO4T,
B.SMAL,
B.SOOL,
B.SSLL,
B.SRLL,
B.SSML,
B.SRML,
B.SOHL,
B.SO1L,
B.SO2L,
B.SO3L,
B.SO4L,
B.SRLC,
B.SPTAB,
B.SOHC,
B.SPOL,
case when(B.DLCC > 1110628) then(1110628) else(B.DLCC) end DLCC,
case when(B.DLCS > 1110628) then(1110628) else(B.DLCS) end DLCS,
B.MLCC,
B.MLCS,
C.PITD1,
C.PITD2,
C.CMDTY,
C.BUYNO,
C.VIACD,
C.CHGN,
J.JDWRVC,
J.JDWDTC,
J.JCDRER,
J.JBMRVC,
J.JBMDTC,
J.JCBRER,
COG.DEDT,
D.XXMENV
FROM
ITMRVA A
JOIN SITMST E
ON E.STID = A.STID
LEFT JOIN ITMRVB B
ON A.STID = B.STID AND A.ITNBR = B.ITNBR AND A.ITRV = B.ITRV
LEFT JOIN ITMRVC C
ON A.STID = C.STID AND A.ITNBR = C.ITNBR AND A.ITRV = C.ITRV
LEFT JOIN ITMRVJ J
ON A.STID = J.JSTID AND A.ITNBR = J.ITNBR AND A.ITRV = J.JITRV
LEFT JOIN JLGF.COGMAT COG
ON COG.STID = A.STID AND COG.ITNBR = A.ITNBR and COG.ITRV = A.ITRV
LEFT JOIN XFRRMT D
ON D.XXTOWH = A.STID
WHERE
( E.UUSA = '1' OR ((A.STID BETWEEN 'E01' AND 'E99') or (A.STID BETWEEN 'EAA' AND 'EZZ')))
AND
(
(
(A.CHDE > 1110627 AND A.CHDE < 1110628)
OR
(
(A.CHDE = 1110627 AND A.CHTE >= '230000')
OR
(A.CHDE = 1110628 AND A.CHTE <= '230000')
)
)
OR
(
(B.CHDE > 1110627 AND B.CHDE < 1110628)
OR
(
(B.CHDE = 1110627 AND B.CHTE >= '230000')
OR
(B.CHDE = 1110628 AND B.CHTE <= '230000')
)
)
OR
(B.DLCC >= 1110627 AND B.DLCC <= 1110628)
OR
(B.DLCS >= 1110627 AND B.DLCS <= 1110628)
OR
(
(C.CHDE > 1110627 AND C.CHDE < 1110628)
OR
(
(C.CHDE = 1110627 AND C.CHTE >= '230000')
OR
(C.CHDE = 1110628 AND C.CHTE <= '230000')
)
)
OR
(
(J.JCHDE > 1110627 AND J.JCHDE < 1110628)
OR
(
(J.JCHDE = 1110627 AND J.JCHTE >= '230000')
OR
(J.JCHDE = 1110628 AND J.JCHTE <= '230000')
)
)
)
June 29, 2011 at 8:50 am
My sympathy for your problem. but it is impossible to make any recommendations from just looking at this one query. There are many details that need to be considered before someone could make any meaningful recommendation.
I doubt that a simple change to the query will have any significant effect, many because of all the ORs. You may be able to partition the data differently or break the one query up into two or three small queries, or even make use of indexed views
Poor designs only show their ugly nature when the tables start getting large.
-- edit: sorry, just noticed you are talking about DB2 not SQL Server. My generic advice is not far off though, I think you will have to dig into the database design a bit to fix it.
The probability of survival is inversely proportional to the angle of arrival.
June 29, 2011 at 9:20 am
i had ideas like sturner to fiddle with, but i have no idea whether db2 supports some of the things i'd try.
for example, isntead of selecting the entire tables aliased as A,B,C, i might try moving the WHERE statements fromt eh bottom into a subselect;
for example this:
--...
FROM ITMRVA A
--CHANGE TO
FROM ( SELECT ITMRVA.* FROM ITMRVA WHERE (
(A.CHDE > 1110627 AND A.CHDE < 1110628)
OR (A.CHDE = 1110627 AND A.CHTE >= '230000')
OR (A.CHDE = 1110628 AND A.CHTE <= '230000')
)
) A
another thing i'd change is the BETWEENS for some text search into a simple LIKE:
--...
WHERE
( E.UUSA = '1' OR ((A.STID BETWEEN 'E01' AND 'E99') or (A.STID BETWEEN 'EAA' AND 'EZZ')))
--CHANGE TO
( E.UUSA = '1' OR ((A.STID LIKE 'E[0-9,A-Z][0-9,A-Z]'))
the stuff i suggest might perfrom the same way anyway...but it's a couple of ideas.
Lowell
June 29, 2011 at 9:20 am
Well since this is pulling from a source system I can not modify anything on that end. The only hope I have of making this run faster/more efficiently is to rework the query. All the tables have hundreds of millions of rows in, so pulling in the whole table every run isn't an option either. I was more so hoping to tune this as best as possible, knowing that the world I work in isn't ideal at the moment.
June 29, 2011 at 9:21 am
Thanks I will give those a try.
June 29, 2011 at 9:33 am
something else that struck me:
(A.CHDE > 1110627 AND A.CHDE < 1110628)
-- this is a Julian-ish date, century 111 month 06 day 27
--isn't that exactly the so someDatefield =1110627 instead?
(A.CHDE = 1110627)
/code]
edit... i mean not equal whops and edited
actually, unless this is parameterized for a range, the first part would ALWAYS find nothing...no integers are > and also < the same limit, right.? so that section should be removed so there's one less evealuation in each of the 4 sub-clauses for the tables statemetns
Lowell
June 29, 2011 at 9:48 am
Wouldn't the <> bring in records before June 27 as well?
June 29, 2011 at 10:07 am
ok the more i look at this the more i get a glimmer of udnerstanding;
i think in that table two columsn represent theDate and TheTime.
CHDE = 1110627 --the date: 2011 06 27
CHTE = '230000' --the time? 23:00:00?
so the whole thing is just trying to limit the data to a days worth of data...
Lowell
June 29, 2011 at 10:09 am
CORRECT
June 29, 2011 at 10:39 am
i doubt this helps much, but it was fun to look at;
i hate nested OR's and parenthesusi, i get dizzy trying to look through them.
can you compare this query to your original? does it return the same results? I think in trying to simplify, i may have lost something...tough to do without test data to validate agaisnt.
SELECT
A.CRDT, A.STID, A.ITNBR, A.ITRV, A.CFST,
A.INVFG, A.ITDSC, A.ENGNO, A.UNMSR, A.ITTYP,
A.ITCLS, A.VNDNR, A.WEGHT, A.PTAXI, A.STAXI,
A.UUSA, A.UUSB, A.UUSC, A.UUCA, A.UUCB,
A.UUCC, A.UUQ1, A.UUA1, A.UUD1, A.UU25,
A.UU40, A.UVHC, A.UVMC, A.SHAP, A.HITE,
A.LONG, A.IDIA, A.ODIA, A.DUOM, A.WIDE,
B.CHDE, B.LOTSZ, B.RCFLG, B.STDUC, B.CURUC,
B.CURSU, B.CTECH, B.SLTAB, B.CLTAB, B.LABHR,
B.CSTAC, B.CSTAS, B.UUSA, B.UUSB, B.UUSC,
B.UUCA, B.UUCB, B.UUCC, B.UUQ1, B.UUA1,
B.UUD1, B.UU40, B.CMAT, B.COOT, B.CPOT,
B.CSLT, B.CRLT, B.CSMT, B.CRMT, B.COHT,
B.CO1T, B.CO2T, B.CO3T, B.CO4T, B.CMAL,
B.COOL, B.CPOL, B.CSLL, B.CRLL, B.CSML,
B.CRML, B.COHL, B.CO1L, B.CO2L, B.CO3L,
B.CO4L, B.CRLC, B.CPTAB, B.COHC, B.SMAT,
B.SOOT, B.SPOT, B.SSLT, B.SRLT, B.SSMT,
B.SRMT, B.SOHT, B.SO1T, B.SO2T, B.SO3T,
B.SO4T, B.SMAL, B.SOOL, B.SSLL, B.SRLL,
B.SSML, B.SRML, B.SOHL, B.SO1L, B.SO2L,
B.SO3L, B.SO4L, B.SRLC, B.SPTAB, B.SOHC,
B.SPOL,
case when(B.DLCC > 1110628) then(1110628) else(B.DLCC) end DLCC,
case when(B.DLCS > 1110628) then(1110628) else(B.DLCS) end DLCS,
B.MLCC, B.MLCS, C.PITD1, C.PITD2, C.CMDTY,
C.BUYNO, C.VIACD, C.CHGN, J.JDWRVC, J.JDWDTC,
J.JCDRER, J.JBMRVC, J.JBMDTC,J.JCBRER, COG.DEDT,
D.XXMENV
--in theory, thos should limit a LOT of the rows for the query
FROM (SELECT * FROM ITMRVA WHERE CHDE IN (1110627,1110628) ) A
INNER JOIN SITMST E ON E.STID = A.STID
LEFT JOIN (SELECT * FROM ITMRVB WHERE CHDE IN (1110627,1110628) ) B
ON A.STID = B.STID
AND A.ITNBR = B.ITNBR
AND A.ITRV = B.ITRV
LEFT JOIN (SELECT * FROM ITMRVC WHERE CHDE IN (1110627,1110628) ) C
ON A.STID = C.STID
AND A.ITNBR = C.ITNBR
AND A.ITRV = C.ITRV
LEFT JOIN (SELECT * FROM ITMRVJ WHERE CHDE IN (1110627,1110628) ) J
ON A.STID = J.JSTID
AND A.ITNBR = J.ITNBR
AND A.ITRV = J.JITRV
LEFT JOIN JLGF.COGMAT COG ON COG.STID = A.STID
AND COG.ITNBR = A.ITNBR
AND COG.ITRV = A.ITRV
LEFT JOIN XFRRMT D ON D.XXTOWH = A.STID
WHERE
( E.UUSA = '1' OR (A.STID LIKE 'E[0-9,A-Z][0-9,A-Z]'))
AND
(
(
(
(A.CHDE = 1110627 AND A.CHTE >= '230000')
OR
(A.CHDE = 1110628 AND A.CHTE <= '230000')
)
)
OR
(
(
(B.CHDE = 1110627 AND B.CHTE >= '230000')
OR
(B.CHDE = 1110628 AND B.CHTE <= '230000')
)
)
OR
(B.DLCC >= 1110627 AND B.DLCC <= 1110628)
OR
(B.DLCS >= 1110627 AND B.DLCS <= 1110628)
OR
(
(
(C.CHDE = 1110627 AND C.CHTE >= '230000')
OR
(C.CHDE = 1110628 AND C.CHTE <= '230000')
)
)
OR
(
(
(J.JCHDE = 1110627 AND J.JCHTE >= '230000')
OR
(J.JCHDE = 1110628 AND J.JCHTE <= '230000')
)
)
)
Lowell
June 29, 2011 at 10:43 am
Thanks, I will give it a try
June 29, 2011 at 10:56 am
It ran fast as compared to the other query, however its count was 1710 as compared to 2286...
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply