Overly Complex, Way to long running query help.

  • First off, this query was delivered to my company many years ago when they bought their original BI solution. I am in the process of trying multiple things to make this work faster. The original ETL tool was cognos decision stream 7.1 and I am now migrating it to SSIS 2k8r2 64 bit. I have tried to 'break apart' the query to make it run faster, however my brain just isn't figuring it out. With all the possibilities due to the multiple 'OR' statements in the where clause. This is pulling from Mapics(iSeries) which is a DB2 system. Any help making this faster would be appreciated. I am working diligently on this however I have hit a wall.

    SELECT

    A.CRDT,

    A.STID,

    A.ITNBR,

    A.ITRV,

    A.CFST,

    A.INVFG,

    A.ITDSC,

    A.ENGNO,

    A.UNMSR,

    A.ITTYP,

    A.ITCLS,

    A.VNDNR,

    A.WEGHT,

    A.PTAXI,

    A.STAXI,

    A.UUSA,

    A.UUSB,

    A.UUSC,

    A.UUCA,

    A.UUCB,

    A.UUCC,

    A.UUQ1,

    A.UUA1,

    A.UUD1,

    A.UU25,

    A.UU40,

    A.UVHC,

    A.UVMC,

    A.SHAP,

    A.HITE,

    A.LONG,

    A.IDIA,

    A.ODIA,

    A.DUOM,

    A.WIDE,

    B.CHDE,

    B.LOTSZ,

    B.RCFLG,

    B.STDUC,

    B.CURUC,

    B.CURSU,

    B.CTECH,

    B.SLTAB,

    B.CLTAB,

    B.LABHR,

    B.CSTAC,

    B.CSTAS,

    B.UUSA,

    B.UUSB,

    B.UUSC,

    B.UUCA,

    B.UUCB,

    B.UUCC,

    B.UUQ1,

    B.UUA1,

    B.UUD1,

    B.UU40,

    B.CMAT,

    B.COOT,

    B.CPOT,

    B.CSLT,

    B.CRLT,

    B.CSMT,

    B.CRMT,

    B.COHT,

    B.CO1T,

    B.CO2T,

    B.CO3T,

    B.CO4T,

    B.CMAL,

    B.COOL,

    B.CPOL,

    B.CSLL,

    B.CRLL,

    B.CSML,

    B.CRML,

    B.COHL,

    B.CO1L,

    B.CO2L,

    B.CO3L,

    B.CO4L,

    B.CRLC,

    B.CPTAB,

    B.COHC,

    B.SMAT,

    B.SOOT,

    B.SPOT,

    B.SSLT,

    B.SRLT,

    B.SSMT,

    B.SRMT,

    B.SOHT,

    B.SO1T,

    B.SO2T,

    B.SO3T,

    B.SO4T,

    B.SMAL,

    B.SOOL,

    B.SSLL,

    B.SRLL,

    B.SSML,

    B.SRML,

    B.SOHL,

    B.SO1L,

    B.SO2L,

    B.SO3L,

    B.SO4L,

    B.SRLC,

    B.SPTAB,

    B.SOHC,

    B.SPOL,

    case when(B.DLCC > 1110628) then(1110628) else(B.DLCC) end DLCC,

    case when(B.DLCS > 1110628) then(1110628) else(B.DLCS) end DLCS,

    B.MLCC,

    B.MLCS,

    C.PITD1,

    C.PITD2,

    C.CMDTY,

    C.BUYNO,

    C.VIACD,

    C.CHGN,

    J.JDWRVC,

    J.JDWDTC,

    J.JCDRER,

    J.JBMRVC,

    J.JBMDTC,

    J.JCBRER,

    COG.DEDT,

    D.XXMENV

    FROM

    ITMRVA A

    JOIN SITMST E

    ON E.STID = A.STID

    LEFT JOIN ITMRVB B

    ON A.STID = B.STID AND A.ITNBR = B.ITNBR AND A.ITRV = B.ITRV

    LEFT JOIN ITMRVC C

    ON A.STID = C.STID AND A.ITNBR = C.ITNBR AND A.ITRV = C.ITRV

    LEFT JOIN ITMRVJ J

    ON A.STID = J.JSTID AND A.ITNBR = J.ITNBR AND A.ITRV = J.JITRV

    LEFT JOIN JLGF.COGMAT COG

    ON COG.STID = A.STID AND COG.ITNBR = A.ITNBR and COG.ITRV = A.ITRV

    LEFT JOIN XFRRMT D

    ON D.XXTOWH = A.STID

    WHERE

    ( E.UUSA = '1' OR ((A.STID BETWEEN 'E01' AND 'E99') or (A.STID BETWEEN 'EAA' AND 'EZZ')))

    AND

    (

    (

    (A.CHDE > 1110627 AND A.CHDE < 1110628)

    OR

    (

    (A.CHDE = 1110627 AND A.CHTE >= '230000')

    OR

    (A.CHDE = 1110628 AND A.CHTE <= '230000')

    )

    )

    OR

    (

    (B.CHDE > 1110627 AND B.CHDE < 1110628)

    OR

    (

    (B.CHDE = 1110627 AND B.CHTE >= '230000')

    OR

    (B.CHDE = 1110628 AND B.CHTE <= '230000')

    )

    )

    OR

    (B.DLCC >= 1110627 AND B.DLCC <= 1110628)

    OR

    (B.DLCS >= 1110627 AND B.DLCS <= 1110628)

    OR

    (

    (C.CHDE > 1110627 AND C.CHDE < 1110628)

    OR

    (

    (C.CHDE = 1110627 AND C.CHTE >= '230000')

    OR

    (C.CHDE = 1110628 AND C.CHTE <= '230000')

    )

    )

    OR

    (

    (J.JCHDE > 1110627 AND J.JCHDE < 1110628)

    OR

    (

    (J.JCHDE = 1110627 AND J.JCHTE >= '230000')

    OR

    (J.JCHDE = 1110628 AND J.JCHTE <= '230000')

    )

    )

    )

  • My sympathy for your problem. but it is impossible to make any recommendations from just looking at this one query. There are many details that need to be considered before someone could make any meaningful recommendation.

    I doubt that a simple change to the query will have any significant effect, many because of all the ORs. You may be able to partition the data differently or break the one query up into two or three small queries, or even make use of indexed views

    Poor designs only show their ugly nature when the tables start getting large.

    -- edit: sorry, just noticed you are talking about DB2 not SQL Server. My generic advice is not far off though, I think you will have to dig into the database design a bit to fix it.

    The probability of survival is inversely proportional to the angle of arrival.

  • i had ideas like sturner to fiddle with, but i have no idea whether db2 supports some of the things i'd try.

    for example, isntead of selecting the entire tables aliased as A,B,C, i might try moving the WHERE statements fromt eh bottom into a subselect;

    for example this:

    --...

    FROM ITMRVA A

    --CHANGE TO

    FROM ( SELECT ITMRVA.* FROM ITMRVA WHERE (

    (A.CHDE > 1110627 AND A.CHDE < 1110628)

    OR (A.CHDE = 1110627 AND A.CHTE >= '230000')

    OR (A.CHDE = 1110628 AND A.CHTE <= '230000')

    )

    ) A

    another thing i'd change is the BETWEENS for some text search into a simple LIKE:

    --...

    WHERE

    ( E.UUSA = '1' OR ((A.STID BETWEEN 'E01' AND 'E99') or (A.STID BETWEEN 'EAA' AND 'EZZ')))

    --CHANGE TO

    ( E.UUSA = '1' OR ((A.STID LIKE 'E[0-9,A-Z][0-9,A-Z]'))

    the stuff i suggest might perfrom the same way anyway...but it's a couple of ideas.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Well since this is pulling from a source system I can not modify anything on that end. The only hope I have of making this run faster/more efficiently is to rework the query. All the tables have hundreds of millions of rows in, so pulling in the whole table every run isn't an option either. I was more so hoping to tune this as best as possible, knowing that the world I work in isn't ideal at the moment.

  • Thanks I will give those a try.

  • something else that struck me:

    (A.CHDE > 1110627 AND A.CHDE < 1110628)

    -- this is a Julian-ish date, century 111 month 06 day 27

    --isn't that exactly the so someDatefield =1110627 instead?

    (A.CHDE = 1110627)

    /code]

    edit... i mean not equal whops and edited

    actually, unless this is parameterized for a range, the first part would ALWAYS find nothing...no integers are > and also < the same limit, right.? so that section should be removed so there's one less evealuation in each of the 4 sub-clauses for the tables statemetns

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Wouldn't the <> bring in records before June 27 as well?

  • ok the more i look at this the more i get a glimmer of udnerstanding;

    i think in that table two columsn represent theDate and TheTime.

    CHDE = 1110627 --the date: 2011 06 27

    CHTE = '230000' --the time? 23:00:00?

    so the whole thing is just trying to limit the data to a days worth of data...

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • CORRECT

  • i doubt this helps much, but it was fun to look at;

    i hate nested OR's and parenthesusi, i get dizzy trying to look through them.

    can you compare this query to your original? does it return the same results? I think in trying to simplify, i may have lost something...tough to do without test data to validate agaisnt.

    SELECT

    A.CRDT, A.STID, A.ITNBR, A.ITRV, A.CFST,

    A.INVFG, A.ITDSC, A.ENGNO, A.UNMSR, A.ITTYP,

    A.ITCLS, A.VNDNR, A.WEGHT, A.PTAXI, A.STAXI,

    A.UUSA, A.UUSB, A.UUSC, A.UUCA, A.UUCB,

    A.UUCC, A.UUQ1, A.UUA1, A.UUD1, A.UU25,

    A.UU40, A.UVHC, A.UVMC, A.SHAP, A.HITE,

    A.LONG, A.IDIA, A.ODIA, A.DUOM, A.WIDE,

    B.CHDE, B.LOTSZ, B.RCFLG, B.STDUC, B.CURUC,

    B.CURSU, B.CTECH, B.SLTAB, B.CLTAB, B.LABHR,

    B.CSTAC, B.CSTAS, B.UUSA, B.UUSB, B.UUSC,

    B.UUCA, B.UUCB, B.UUCC, B.UUQ1, B.UUA1,

    B.UUD1, B.UU40, B.CMAT, B.COOT, B.CPOT,

    B.CSLT, B.CRLT, B.CSMT, B.CRMT, B.COHT,

    B.CO1T, B.CO2T, B.CO3T, B.CO4T, B.CMAL,

    B.COOL, B.CPOL, B.CSLL, B.CRLL, B.CSML,

    B.CRML, B.COHL, B.CO1L, B.CO2L, B.CO3L,

    B.CO4L, B.CRLC, B.CPTAB, B.COHC, B.SMAT,

    B.SOOT, B.SPOT, B.SSLT, B.SRLT, B.SSMT,

    B.SRMT, B.SOHT, B.SO1T, B.SO2T, B.SO3T,

    B.SO4T, B.SMAL, B.SOOL, B.SSLL, B.SRLL,

    B.SSML, B.SRML, B.SOHL, B.SO1L, B.SO2L,

    B.SO3L, B.SO4L, B.SRLC, B.SPTAB, B.SOHC,

    B.SPOL,

    case when(B.DLCC > 1110628) then(1110628) else(B.DLCC) end DLCC,

    case when(B.DLCS > 1110628) then(1110628) else(B.DLCS) end DLCS,

    B.MLCC, B.MLCS, C.PITD1, C.PITD2, C.CMDTY,

    C.BUYNO, C.VIACD, C.CHGN, J.JDWRVC, J.JDWDTC,

    J.JCDRER, J.JBMRVC, J.JBMDTC,J.JCBRER, COG.DEDT,

    D.XXMENV

    --in theory, thos should limit a LOT of the rows for the query

    FROM (SELECT * FROM ITMRVA WHERE CHDE IN (1110627,1110628) ) A

    INNER JOIN SITMST E ON E.STID = A.STID

    LEFT JOIN (SELECT * FROM ITMRVB WHERE CHDE IN (1110627,1110628) ) B

    ON A.STID = B.STID

    AND A.ITNBR = B.ITNBR

    AND A.ITRV = B.ITRV

    LEFT JOIN (SELECT * FROM ITMRVC WHERE CHDE IN (1110627,1110628) ) C

    ON A.STID = C.STID

    AND A.ITNBR = C.ITNBR

    AND A.ITRV = C.ITRV

    LEFT JOIN (SELECT * FROM ITMRVJ WHERE CHDE IN (1110627,1110628) ) J

    ON A.STID = J.JSTID

    AND A.ITNBR = J.ITNBR

    AND A.ITRV = J.JITRV

    LEFT JOIN JLGF.COGMAT COG ON COG.STID = A.STID

    AND COG.ITNBR = A.ITNBR

    AND COG.ITRV = A.ITRV

    LEFT JOIN XFRRMT D ON D.XXTOWH = A.STID

    WHERE

    ( E.UUSA = '1' OR (A.STID LIKE 'E[0-9,A-Z][0-9,A-Z]'))

    AND

    (

    (

    (

    (A.CHDE = 1110627 AND A.CHTE >= '230000')

    OR

    (A.CHDE = 1110628 AND A.CHTE <= '230000')

    )

    )

    OR

    (

    (

    (B.CHDE = 1110627 AND B.CHTE >= '230000')

    OR

    (B.CHDE = 1110628 AND B.CHTE <= '230000')

    )

    )

    OR

    (B.DLCC >= 1110627 AND B.DLCC <= 1110628)

    OR

    (B.DLCS >= 1110627 AND B.DLCS <= 1110628)

    OR

    (

    (

    (C.CHDE = 1110627 AND C.CHTE >= '230000')

    OR

    (C.CHDE = 1110628 AND C.CHTE <= '230000')

    )

    )

    OR

    (

    (

    (J.JCHDE = 1110627 AND J.JCHTE >= '230000')

    OR

    (J.JCHDE = 1110628 AND J.JCHTE <= '230000')

    )

    )

    )

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thanks, I will give it a try

  • It ran fast as compared to the other query, however its count was 1710 as compared to 2286...

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply