• eseosaoregie (4/13/2013)


    I am currently working on project where we are extracting data from some AX tables. The data from these extracts is transformed in vary ways to be consumed by various applications. One of the of the transformations requires many column values to be concatenated into one column values. The query that is currently used to do this was written by a contractor who has now left. It uses the PATH mode FOR XML Path to concatenate values. The query currently takes 24 minutes to run. We are currently in development but ideally in production this data needs to be extracted every 7 minutes. I would just like to know if there is a way I can tune the query or rewrite it in another way that could speed up the retrieval. The source tables contain about 4- 5 million records.

    The query is shown below

    SELECT i.ITEMID AS CRItemID, i.PSTPRODUCTID AS CRProductID, i.NAMEALIAS AS CRName, i.CREATIONPERFORMER AS CRPerformer,

    i.CREATIONDURATION AS CRDuration, i.CREATIONSTATUS AS CRStatus, i.[MODIFIEDDATETIME] AS CRModifiedDateTime,

    STUFF

    ((SELECT ' ; ' + n.CRNUMBERTYPEID + '|#|' + n.CRNUMBER + CASE n.CRSOCIETYCODE WHEN '0' THEN '' ELSE '|#|' + n.CRSOCIETYCODE END

    FROM CRALTNUMBERS n

    WHERE n.CREATIONID = I.Itemid FOR XML PATH('')), 1, 3, '') AS CRNumbers,

    STUFF

    ((SELECT ' ; ' + t .CRTITLE

    FROM CRALTTITLES t

    WHERE t .CREATIONID = I.Itemid FOR XML PATH('')), 1, 3, '') AS CRTitles,

    STUFF

    ((SELECT ' ; ' + LTRIM(m.FIRSTNAME + ' ' + m.NAME)

    FROM MATCHNAME m

    WHERE m.CREATIONID = I.Itemid FOR XML PATH('')), 1, 3, '') AS CRNames,

    STUFF

    ((SELECT ' ; ' + b.BOMID

    FROM BOM b

    WHERE b.ITEMID = I.Itemid FOR XML PATH('')), 1, 3, '') AS CRParents,

    STUFF

    ((SELECT ' ; ' + b.ITEMID

    FROM BOM b

    WHERE b.BOMID = I.Itemid FOR XML PATH('')), 1, 3, '') AS CRChildren,

    STUFF

    ((SELECT ' ; ' + p.DATA

    FROM PSTPRODTMPLDATA p

    WHERE p.TEMPLATEID = I.Itemid AND p.SPECID = 'Episode title' FOR XML PATH('')), 1, 3, '') AS CREpisodeTitle,

    STUFF

    ((SELECT ' ; ' + p.DATA

    FROM PSTPRODTMPLDATA p

    WHERE p.TEMPLATEID = I.Itemid AND p.SPECID = 'Commercial brand' FOR XML PATH('')), 1, 3, '') AS CRComBrand,

    STUFF

    ((SELECT ' ; ' + p.DATA

    FROM PSTPRODTMPLDATA p

    WHERE p.TEMPLATEID = I.Itemid AND p.SPECID = 'Commercial product' FOR XML PATH('')), 1, 3, '') AS CRComProduct,

    STUFF

    ((SELECT ' ; ' + p.DATA

    FROM PSTPRODTMPLDATA p

    WHERE p.TEMPLATEID = I.Itemid AND p.SPECID = 'Commercial type' FOR XML PATH('')), 1, 3, '') AS CRComType,

    STUFF

    ((SELECT ' ; ' + p.DATA

    FROM PSTPRODTMPLDATA p

    WHERE p.TEMPLATEID = I.Itemid AND p.SPECID = 'Country of origin' FOR XML PATH('')), 1, 3, '') AS CRCountry,

    STUFF

    ((SELECT ' ; ' + p.DATA

    FROM PSTPRODTMPLDATA p

    WHERE p.TEMPLATEID = I.Itemid AND p.SPECID = 'Year of production' FOR XML PATH('')), 1, 3, '') AS CRYear

    FROMINVENTTABLE i

    I have also attached a copy of the execution plan of the latest query run. Any help would be much appreciated.

    The major problem with that query is that it recalculates the colon separate list for like rows which is a huge waste of resources. Using "Divide'n'Conquer" methods, a separate table should be calculated to hold single instances of each concatenation grouped by I.ItemID and p.SPECID using a single query with a GROUP BY to build such a thing.

    The same is also true of the other lookup tables (like the BOM table). The same information is concatenated over and over and over for each row.

    The key to performance on this problem will be to correctly "pre-aggregate" the concatenations in separate tables and then join to those tables.

    To wit, even the introduction of a CLR to do the concatenation might not be as performant as it could be because it would still have to do the concatentation of identical data using the current structure of the current query.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)