Query Performance Impact due to FOR XML PATH and Large Data

  • Hi Experts,

    i'm running the following query and it takes large amount of time to execute, the source data has  23 million records., Please advise if there's  better way to do it.  Takes nearly 3 hours and 48mins to process


    Select [ATTRIBUTE_SET_ID],[IndividualMaterialByIDResponse_sync_V1ID],[ClassNumber],[CharacteristicName],[ClassType],[CharacteristicValue]= Stuff((Select '|' +[CharacteristicValue] from [dbo].[GEMS_CIX_EQ_C] t1
    where T1.[ATTRIBUTE_SET_ID] =T2.[ATTRIBUTE_SET_ID] AND t1.[IndividualMaterialByIDResponse_sync_V1ID] = t2.[IndividualMaterialByIDResponse_sync_V1ID] AND
    T1.ClassNumber = T2.ClassNumber AND T1.[CharacteristicName] = T2.CharacteristicName AND T1.ClassType = T2.ClassType
    for XML PATH ('')),1,1,'')
    Into [TMP].[SSIS_EQ_AVF_C]
    from [dbo].[GEMS_CIX_EQ_C] t2 ---where t2.ATTRIBUTE_SET_ID = 'GNCLS2'
    group by [ATTRIBUTE_SET_ID],[IndividualMaterialByIDResponse_sync_V1ID],[ClassNumber],[CharacteristicName],[ClassType]

    Thank you.
    LW

  • could you post the indexes and execution plan?

    ***The first step is always the hardest *******

  • That's because you're using a correlated subquery in SELECT part of the query.

    Try to make it into CROSS APPLY.

    _____________
    Code for TallyGenerator

  • I notice you have a GROUP BY clause at the end, how many rows are there typically in table GEMS_CIX_EQ_C for the combination of the 4 columns in the GROUP BY?  Is there an index on the 4 columns in the order they are specified in the GROUP BY?  It's difficult to give specific advice without knowing the execution plan and indexes as has already been mentioned.

  • SGT_squeequal - Wednesday, July 18, 2018 1:22 AM

    could you post the indexes and execution plan?

    I have attached the exe plan.  I have created a non unique, non clustered index, I cannot use a unique index because the source table has duplicate entries however characteristic value is same or different for those entries, then to be merged as a single row.

  • Chris Harshman - Wednesday, July 18, 2018 6:45 AM

    I notice you have a GROUP BY clause at the end, how many rows are there typically in table GEMS_CIX_EQ_C for the combination of the 4 columns in the GROUP BY?  Is there an index on the 4 columns in the order they are specified in the GROUP BY?  It's difficult to give specific advice without knowing the execution plan and indexes as has already been mentioned.

    Source table has nearly 20 million rows to be processed.  What i'm trying to do is if there are any duplicate rows it should merge with the characteristic value adding a '|' in between.

    I have specified index for the  [IndividualMaterialByIDResponse_sync_V1ID]  column, should i created index for all columns in group by clause?
    Attached the exe plan.

    Thank you.

  • Sergiy - Wednesday, July 18, 2018 2:42 AM

    That's because you're using a correlated subquery in SELECT part of the query.Try to make it into CROSS APPLY.

    I'm not sure how to do this.. 🙁  if you can add a sample code...

  • SELECT ..., t3.ValueList

    from [dbo].[GEMS_CIX_EQ_C] t2

    Cross apply ( select STUFF(....) from [dbo].[GEMS_CIX_EQ_C] t1

    Where ....

    ) t3 (ValueList)

    Group by ...

    _____________
    Code for TallyGenerator

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply