Query Tuning

  • Hi,

    Can you please suggest what can be done to tune the following query ?

    SELECT 
    A.accounting_program_code ACCT_PGM_CD,
    A.apply_3_ownership_level_limit_indicator APP_3_OWNSHP_LVL_LMT_IND,
    A.commodity_code CMDY_CD,
    A.county_fsa_code CNTY_FSA_CD,
    A.creation_date CRE_DT,
    A.creation_user_name CRE_USER_NM,
    A.data_status_code DATA_STAT_CD,
    A.decimal_precision_number DEC_PRCS_NBR,
    A.hierarchical_payment_limitation_indicator HRCHL_PYMT_LMT_IND,
    A.last_change_date LAST_CHG_DT,
    A.last_change_user_name LAST_CHG_USER_NM,
    A.override_subsidiary_period_start_year OVRRD_SBSD_PRD_STRT_YR,
    Paid_subsidiary_customer.core_customer_identifier PAID_CORE_CUST_ID,
    paid_fsa_county.state_county_fsa_code PAID_CORE_CUST_ST_CNTY_FSA_CD,
    A.processing_request_reference_number PROC_RQST_REF_NBR,
    A.persist_overpayment_attributions_indicator PRST_OPYMT_ATRB_IND,
    B.accounting_program_code PRV_PRPS_PYMT_ACCT_PGM_CD,
    B.county_fsa_code PRV_PRPS_PYMT_CNTY_FSA_CD,
    A.previous_B_identifier PRV_PRPS_PYMT_ ID,
    B.subsidiary_period_start_year PRV_PRPS_PYMT_SBSD_PRD_STRT_YR,
    B.state_fsa_code PRV_PRPS_PYMT_ST_FSA_CD,
    A.payment_amount PYMT_AMT,
    A.payment_attribution_request_reason_text PYMT_ATRB_RQST_RSN_TXT,
    A.payment_attribution_reversed_date PYMT_ATRB_RVRS_DT,
    A.payment_limitation_indicator PYMT_LMT_IND,
    A.payment_limitation_year PYMT_LMT_YR,
    A.limited_payment_program_identifier PYMT_PGM_ID,
    A.record_reference_primary_identification RCD_REF_PRIM_ID,
    A.record_reference_primary_identification_type_code RCD_REF_PRIM_ID_TYPE_CD,
    A.record_reference_secondary_identification RCD_REF_SCND_ID,
    A.record_reference_secondary_identification_type_code RCD_REF_SCND_ID_TYPE_CD,
    A.subsidiary_period_start_year SBSD_PRD_STRT_YR,
    A.state_fsa_code ST_FSA_CD,
    A.A_identifier UCHG_ATRB_PRPS_PYMT_LOG_ID
    FROM <V_SRC_SCHM_NM>.A  (Driving Table)
    left join <V_SRC_SCHM_NM>.B
    on (A.previous_B_identifier =B.B_identifier)
    Left Join 
     (SELECT * 
                FROM
                (SELECT *
                    ,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk  
                    FROM
                        (SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.C                        where parent_payment_attribution_identifier is NULL
                        UNION
                        SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.D                        where parent_payment_attribution_identifier is NULL
                        ) attr_union
                 )attr_all
                WHERE 1 = 1
                AND rnk = 1
                ) ALIAS 
     on (B.B_identifier =ALIAS.B_identifier)
       LEFT JOIN <V_SRC_SCHM_NM>.F
        ON (ALIAS.paid_subsidiary_customer_identifier = F_customer_identifier)
        /*get paid customer fsa state county information*/
        LEFT JOIN <V_SRC_SCHM_NMv.G
        ON (F_identifier = G_county_identifier)

    I Just changed  the tables names,

    A is the driving table which has 763 Mill Records
    B is the Joining Table which has 104 Million

    I am trying batch load  this table so i am selecting 50 million from the driving table I cant restrict the records in Joins and the joining tables are huge as well and especially below part of the query is killing the performance 

    SELECT * 
                FROM
                (SELECT *
                    ,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk  
                    FROM
                        (SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.C                        where parent_payment_attribution_identifier is NULL
                        UNION
                        SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.D                        where parent_payment_attribution_identifier is NULL
                        ) attr_union
                 )attr_all
                WHERE 1 = 1
                AND rnk = 1
                ) ALIAS 
     on (B.B_identifier =ALIAS.B_identifier)

    Table where i am doing union and later Rank has 90 Mill and 300 Mill respectively

    Any suggestion son tuning are appreciated, Thank you.

  • masstudy23 - Monday, May 14, 2018 3:03 PM

    Hi,

    Can you please suggest what can be done to tune the following query ?

    SELECT 
    A.accounting_program_code ACCT_PGM_CD,
    A.apply_3_ownership_level_limit_indicator APP_3_OWNSHP_LVL_LMT_IND,
    A.commodity_code CMDY_CD,
    A.county_fsa_code CNTY_FSA_CD,
    A.creation_date CRE_DT,
    A.creation_user_name CRE_USER_NM,
    A.data_status_code DATA_STAT_CD,
    A.decimal_precision_number DEC_PRCS_NBR,
    A.hierarchical_payment_limitation_indicator HRCHL_PYMT_LMT_IND,
    A.last_change_date LAST_CHG_DT,
    A.last_change_user_name LAST_CHG_USER_NM,
    A.override_subsidiary_period_start_year OVRRD_SBSD_PRD_STRT_YR,
    Paid_subsidiary_customer.core_customer_identifier PAID_CORE_CUST_ID,
    paid_fsa_county.state_county_fsa_code PAID_CORE_CUST_ST_CNTY_FSA_CD,
    A.processing_request_reference_number PROC_RQST_REF_NBR,
    A.persist_overpayment_attributions_indicator PRST_OPYMT_ATRB_IND,
    B.accounting_program_code PRV_PRPS_PYMT_ACCT_PGM_CD,
    B.county_fsa_code PRV_PRPS_PYMT_CNTY_FSA_CD,
    A.previous_B_identifier PRV_PRPS_PYMT_ ID,
    B.subsidiary_period_start_year PRV_PRPS_PYMT_SBSD_PRD_STRT_YR,
    B.state_fsa_code PRV_PRPS_PYMT_ST_FSA_CD,
    A.payment_amount PYMT_AMT,
    A.payment_attribution_request_reason_text PYMT_ATRB_RQST_RSN_TXT,
    A.payment_attribution_reversed_date PYMT_ATRB_RVRS_DT,
    A.payment_limitation_indicator PYMT_LMT_IND,
    A.payment_limitation_year PYMT_LMT_YR,
    A.limited_payment_program_identifier PYMT_PGM_ID,
    A.record_reference_primary_identification RCD_REF_PRIM_ID,
    A.record_reference_primary_identification_type_code RCD_REF_PRIM_ID_TYPE_CD,
    A.record_reference_secondary_identification RCD_REF_SCND_ID,
    A.record_reference_secondary_identification_type_code RCD_REF_SCND_ID_TYPE_CD,
    A.subsidiary_period_start_year SBSD_PRD_STRT_YR,
    A.state_fsa_code ST_FSA_CD,
    A.A_identifier UCHG_ATRB_PRPS_PYMT_LOG_ID
    FROM <V_SRC_SCHM_NM>.A  (Driving Table)
    left join <V_SRC_SCHM_NM>.B
    on (A.previous_B_identifier =B.B_identifier)
    Left Join 
     (SELECT * 
                FROM
                (SELECT *
                    ,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk  
                    FROM
                        (SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.C                        where parent_payment_attribution_identifier is NULL
                        UNION
                        SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.D                        where parent_payment_attribution_identifier is NULL
                        ) attr_union
                 )attr_all
                WHERE 1 = 1
                AND rnk = 1
                ) ALIAS 
     on (B.B_identifier =ALIAS.B_identifier)
       LEFT JOIN <V_SRC_SCHM_NM>.F
        ON (ALIAS.paid_subsidiary_customer_identifier = F_customer_identifier)
        /*get paid customer fsa state county information*/
        LEFT JOIN <V_SRC_SCHM_NMv.G
        ON (F_identifier = G_county_identifier)

    I Just changed  the tables names,

    A is the driving table which has 763 Mill Records
    B is the Joining Table which has 104 Million

    I am trying batch load  this table so i am selecting 50 million from the driving table I cant restrict the records in Joins and the joining tables are huge as well and especially below part of the query is killing the performance 

    SELECT * 
                FROM
                (SELECT *
                    ,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk  
                    FROM
                        (SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.C                        where parent_payment_attribution_identifier is NULL
                        UNION
                        SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.D                        where parent_payment_attribution_identifier is NULL
                        ) attr_union
                 )attr_all
                WHERE 1 = 1
                AND rnk = 1
                ) ALIAS 
     on (B.B_identifier =ALIAS.B_identifier)

    Table where i am doing union and later Rank has 90 Mill and 300 Mill respectively

    Any suggestion son tuning are appreciated, Thank you.

    Start with creating a temp table to hold the most recent "last_change_date" for each payment_attribution_identifier, once for each of the two different tables, and then join to that temp table on those 2 columns within the union portion of the query, and see if that helps.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Monday, May 14, 2018 3:24 PM

    masstudy23 - Monday, May 14, 2018 3:03 PM

    Hi,

    Can you please suggest what can be done to tune the following query ?

    SELECT 
    A.accounting_program_code ACCT_PGM_CD,
    A.apply_3_ownership_level_limit_indicator APP_3_OWNSHP_LVL_LMT_IND,
    A.commodity_code CMDY_CD,
    A.county_fsa_code CNTY_FSA_CD,
    A.creation_date CRE_DT,
    A.creation_user_name CRE_USER_NM,
    A.data_status_code DATA_STAT_CD,
    A.decimal_precision_number DEC_PRCS_NBR,
    A.hierarchical_payment_limitation_indicator HRCHL_PYMT_LMT_IND,
    A.last_change_date LAST_CHG_DT,
    A.last_change_user_name LAST_CHG_USER_NM,
    A.override_subsidiary_period_start_year OVRRD_SBSD_PRD_STRT_YR,
    Paid_subsidiary_customer.core_customer_identifier PAID_CORE_CUST_ID,
    paid_fsa_county.state_county_fsa_code PAID_CORE_CUST_ST_CNTY_FSA_CD,
    A.processing_request_reference_number PROC_RQST_REF_NBR,
    A.persist_overpayment_attributions_indicator PRST_OPYMT_ATRB_IND,
    B.accounting_program_code PRV_PRPS_PYMT_ACCT_PGM_CD,
    B.county_fsa_code PRV_PRPS_PYMT_CNTY_FSA_CD,
    A.previous_B_identifier PRV_PRPS_PYMT_ ID,
    B.subsidiary_period_start_year PRV_PRPS_PYMT_SBSD_PRD_STRT_YR,
    B.state_fsa_code PRV_PRPS_PYMT_ST_FSA_CD,
    A.payment_amount PYMT_AMT,
    A.payment_attribution_request_reason_text PYMT_ATRB_RQST_RSN_TXT,
    A.payment_attribution_reversed_date PYMT_ATRB_RVRS_DT,
    A.payment_limitation_indicator PYMT_LMT_IND,
    A.payment_limitation_year PYMT_LMT_YR,
    A.limited_payment_program_identifier PYMT_PGM_ID,
    A.record_reference_primary_identification RCD_REF_PRIM_ID,
    A.record_reference_primary_identification_type_code RCD_REF_PRIM_ID_TYPE_CD,
    A.record_reference_secondary_identification RCD_REF_SCND_ID,
    A.record_reference_secondary_identification_type_code RCD_REF_SCND_ID_TYPE_CD,
    A.subsidiary_period_start_year SBSD_PRD_STRT_YR,
    A.state_fsa_code ST_FSA_CD,
    A.A_identifier UCHG_ATRB_PRPS_PYMT_LOG_ID
    FROM <V_SRC_SCHM_NM>.A  (Driving Table)
    left join <V_SRC_SCHM_NM>.B
    on (A.previous_B_identifier =B.B_identifier)
    Left Join 
     (SELECT * 
                FROM
                (SELECT *
                    ,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk  
                    FROM
                        (SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.C                        where parent_payment_attribution_identifier is NULL
                        UNION
                        SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.D                        where parent_payment_attribution_identifier is NULL
                        ) attr_union
                 )attr_all
                WHERE 1 = 1
                AND rnk = 1
                ) ALIAS 
     on (B.B_identifier =ALIAS.B_identifier)
       LEFT JOIN <V_SRC_SCHM_NM>.F
        ON (ALIAS.paid_subsidiary_customer_identifier = F_customer_identifier)
        /*get paid customer fsa state county information*/
        LEFT JOIN <V_SRC_SCHM_NMv.G
        ON (F_identifier = G_county_identifier)

    I Just changed  the tables names,

    A is the driving table which has 763 Mill Records
    B is the Joining Table which has 104 Million

    I am trying batch load  this table so i am selecting 50 million from the driving table I cant restrict the records in Joins and the joining tables are huge as well and especially below part of the query is killing the performance 

    SELECT * 
                FROM
                (SELECT *
                    ,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk  
                    FROM
                        (SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.C                        where parent_payment_attribution_identifier is NULL
                        UNION
                        SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.D                        where parent_payment_attribution_identifier is NULL
                        ) attr_union
                 )attr_all
                WHERE 1 = 1
                AND rnk = 1
                ) ALIAS 
     on (B.B_identifier =ALIAS.B_identifier)

    Table where i am doing union and later Rank has 90 Mill and 300 Mill respectively

    Any suggestion son tuning are appreciated, Thank you.

    Start with creating a temp table to hold the most recent "last_change_date" for each payment_attribution_identifier, once for each of the two different tables, and then join to that temp table on those 2 columns within the union portion of the query, and see if that helps.

    Steve,

    Thank you for your suggestion, I followed following steps based on your suggestion,

    --CREATE temp Tables

    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
    subsidiary_DP.dbo.payment_attribution group by payment_attribution_identifier

    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
    subsidiary_DP.dbo.payment_attribution_audit group by payment_attribution_identifier

    -- UNION TEMP Tables

    SELECT payment_attribution_identifier, last_change_date from into # PAI_FINAL(
    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
    <Table A> group by payment_attribution_identifier
    UNION
    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
    <Table B> group by payment_attribution_identifier)
    group by payment_attribution_identifier

    -- Join Temp Table to get otjer columns from PA table
    SELECT
    PA.payment_attribution_identifier,
    PA.subsidiary_customer_identifier,
    PA.paid_subsidiary_customer_identifier,
    PA.proposed_payment_identifier,
    PA.last_change_date INTO #PAI2 FROM
    #PAI T1
    JOIN
    <Table A> PA ON
    T1.payment_attribution_identifier=PA.payment_attribution_identifier
    AND parent_payment_attribution_identifier is NULL

    SELECT * FROm #PAI2

    -- Join Temp Table to get other columns from PAA table
    SELECT
    T1.payment_attribution_identifier,
    PA.subsidiary_customer_identifier,
    PA.paid_subsidiary_customer_identifier,
    PA.proposed_payment_identifier,
    T1.last_change_date INTO #PAI3 FROM
    #PAIA T1
    LEFT JOIN
    <Table B> PA ON
    T1.payment_attribution_identifier=PA.payment_attribution_identifier
    AND T1.last_change_date=PA.last_change_date
    AND parent_payment_attribution_identifier is NULL --(32135783 row(s) affected)

    -- UNIOn Temp Tables,
    Select * INTo #PAI_FINAL from
        (SELECT *,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
        FROM (
            SELECT * FROM #PAI2
            UNION
            SELECT * FROM #PAI3
            ) PA_ALL
    ) PA_ALL_R WHERE rnk = 1

    This entire process completed in 15-20 min, can you suggest how we can put this into a single query so that i can use it in the main query ? Thank you

  • masstudy23 - Monday, May 14, 2018 10:17 PM

    sgmunson - Monday, May 14, 2018 3:24 PM

    masstudy23 - Monday, May 14, 2018 3:03 PM

    Hi,

    Can you please suggest what can be done to tune the following query ?

    SELECT 
    A.accounting_program_code ACCT_PGM_CD,
    A.apply_3_ownership_level_limit_indicator APP_3_OWNSHP_LVL_LMT_IND,
    A.commodity_code CMDY_CD,
    A.county_fsa_code CNTY_FSA_CD,
    A.creation_date CRE_DT,
    A.creation_user_name CRE_USER_NM,
    A.data_status_code DATA_STAT_CD,
    A.decimal_precision_number DEC_PRCS_NBR,
    A.hierarchical_payment_limitation_indicator HRCHL_PYMT_LMT_IND,
    A.last_change_date LAST_CHG_DT,
    A.last_change_user_name LAST_CHG_USER_NM,
    A.override_subsidiary_period_start_year OVRRD_SBSD_PRD_STRT_YR,
    Paid_subsidiary_customer.core_customer_identifier PAID_CORE_CUST_ID,
    paid_fsa_county.state_county_fsa_code PAID_CORE_CUST_ST_CNTY_FSA_CD,
    A.processing_request_reference_number PROC_RQST_REF_NBR,
    A.persist_overpayment_attributions_indicator PRST_OPYMT_ATRB_IND,
    B.accounting_program_code PRV_PRPS_PYMT_ACCT_PGM_CD,
    B.county_fsa_code PRV_PRPS_PYMT_CNTY_FSA_CD,
    A.previous_B_identifier PRV_PRPS_PYMT_ ID,
    B.subsidiary_period_start_year PRV_PRPS_PYMT_SBSD_PRD_STRT_YR,
    B.state_fsa_code PRV_PRPS_PYMT_ST_FSA_CD,
    A.payment_amount PYMT_AMT,
    A.payment_attribution_request_reason_text PYMT_ATRB_RQST_RSN_TXT,
    A.payment_attribution_reversed_date PYMT_ATRB_RVRS_DT,
    A.payment_limitation_indicator PYMT_LMT_IND,
    A.payment_limitation_year PYMT_LMT_YR,
    A.limited_payment_program_identifier PYMT_PGM_ID,
    A.record_reference_primary_identification RCD_REF_PRIM_ID,
    A.record_reference_primary_identification_type_code RCD_REF_PRIM_ID_TYPE_CD,
    A.record_reference_secondary_identification RCD_REF_SCND_ID,
    A.record_reference_secondary_identification_type_code RCD_REF_SCND_ID_TYPE_CD,
    A.subsidiary_period_start_year SBSD_PRD_STRT_YR,
    A.state_fsa_code ST_FSA_CD,
    A.A_identifier UCHG_ATRB_PRPS_PYMT_LOG_ID
    FROM <V_SRC_SCHM_NM>.A  (Driving Table)
    left join <V_SRC_SCHM_NM>.B
    on (A.previous_B_identifier =B.B_identifier)
    Left Join 
     (SELECT * 
                FROM
                (SELECT *
                    ,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk  
                    FROM
                        (SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.C                        where parent_payment_attribution_identifier is NULL
                        UNION
                        SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.D                        where parent_payment_attribution_identifier is NULL
                        ) attr_union
                 )attr_all
                WHERE 1 = 1
                AND rnk = 1
                ) ALIAS 
     on (B.B_identifier =ALIAS.B_identifier)
       LEFT JOIN <V_SRC_SCHM_NM>.F
        ON (ALIAS.paid_subsidiary_customer_identifier = F_customer_identifier)
        /*get paid customer fsa state county information*/
        LEFT JOIN <V_SRC_SCHM_NMv.G
        ON (F_identifier = G_county_identifier)

    I Just changed  the tables names,

    A is the driving table which has 763 Mill Records
    B is the Joining Table which has 104 Million

    I am trying batch load  this table so i am selecting 50 million from the driving table I cant restrict the records in Joins and the joining tables are huge as well and especially below part of the query is killing the performance 

    SELECT * 
                FROM
                (SELECT *
                    ,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk  
                    FROM
                        (SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.C                        where parent_payment_attribution_identifier is NULL
                        UNION
                        SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.D                        where parent_payment_attribution_identifier is NULL
                        ) attr_union
                 )attr_all
                WHERE 1 = 1
                AND rnk = 1
                ) ALIAS 
     on (B.B_identifier =ALIAS.B_identifier)

    Table where i am doing union and later Rank has 90 Mill and 300 Mill respectively

    Any suggestion son tuning are appreciated, Thank you.

    Start with creating a temp table to hold the most recent "last_change_date" for each payment_attribution_identifier, once for each of the two different tables, and then join to that temp table on those 2 columns within the union portion of the query, and see if that helps.

    Steve,

    Thank you for your suggestion, I followed following steps based on your suggestion,

    --CREATE temp Tables

    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
    subsidiary_DP.dbo.payment_attribution group by payment_attribution_identifier

    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
    subsidiary_DP.dbo.payment_attribution_audit group by payment_attribution_identifier

    -- UNION TEMP Tables

    SELECT payment_attribution_identifier, last_change_date from into # PAI_FINAL(
    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
    <Table A> group by payment_attribution_identifier
    UNION
    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
    <Table B> group by payment_attribution_identifier)
    group by payment_attribution_identifier

    -- Join Temp Table to get otjer columns from PA table
    SELECT
    PA.payment_attribution_identifier,
    PA.subsidiary_customer_identifier,
    PA.paid_subsidiary_customer_identifier,
    PA.proposed_payment_identifier,
    PA.last_change_date INTO #PAI2 FROM
    #PAI T1
    JOIN
    <Table A> PA ON
    T1.payment_attribution_identifier=PA.payment_attribution_identifier
    AND parent_payment_attribution_identifier is NULL

    SELECT * FROm #PAI2

    -- Join Temp Table to get other columns from PAA table
    SELECT
    T1.payment_attribution_identifier,
    PA.subsidiary_customer_identifier,
    PA.paid_subsidiary_customer_identifier,
    PA.proposed_payment_identifier,
    T1.last_change_date INTO #PAI3 FROM
    #PAIA T1
    LEFT JOIN
    <Table B> PA ON
    T1.payment_attribution_identifier=PA.payment_attribution_identifier
    AND T1.last_change_date=PA.last_change_date
    AND parent_payment_attribution_identifier is NULL --(32135783 row(s) affected)

    -- UNIOn Temp Tables,
    Select * INTo #PAI_FINAL from
        (SELECT *,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
        FROM (
            SELECT * FROM #PAI2
            UNION
            SELECT * FROM #PAI3
            ) PA_ALL
    ) PA_ALL_R WHERE rnk = 1

    This entire process completed in 15-20 min, can you suggest how we can put this into a single query so that i can use it in the main query ? Thank you

    How many rows of data are being processed?
    You may need to see if there is anything more that you divide and conquer before doing your final query.  Doing things all in one query isn't always the best the way to solve a problem.  Sometimes to you have to break it down into smaller indivual pieces.

    Also, use the SQL Code blocks to allow yourself to post formatted SQL code. It will make your code easier to read by others.

  • masstudy23 - Monday, May 14, 2018 10:17 PM

    sgmunson - Monday, May 14, 2018 3:24 PM

    masstudy23 - Monday, May 14, 2018 3:03 PM

    Hi,

    Can you please suggest what can be done to tune the following query ?

    SELECT 
    A.accounting_program_code ACCT_PGM_CD,
    A.apply_3_ownership_level_limit_indicator APP_3_OWNSHP_LVL_LMT_IND,
    A.commodity_code CMDY_CD,
    A.county_fsa_code CNTY_FSA_CD,
    A.creation_date CRE_DT,
    A.creation_user_name CRE_USER_NM,
    A.data_status_code DATA_STAT_CD,
    A.decimal_precision_number DEC_PRCS_NBR,
    A.hierarchical_payment_limitation_indicator HRCHL_PYMT_LMT_IND,
    A.last_change_date LAST_CHG_DT,
    A.last_change_user_name LAST_CHG_USER_NM,
    A.override_subsidiary_period_start_year OVRRD_SBSD_PRD_STRT_YR,
    Paid_subsidiary_customer.core_customer_identifier PAID_CORE_CUST_ID,
    paid_fsa_county.state_county_fsa_code PAID_CORE_CUST_ST_CNTY_FSA_CD,
    A.processing_request_reference_number PROC_RQST_REF_NBR,
    A.persist_overpayment_attributions_indicator PRST_OPYMT_ATRB_IND,
    B.accounting_program_code PRV_PRPS_PYMT_ACCT_PGM_CD,
    B.county_fsa_code PRV_PRPS_PYMT_CNTY_FSA_CD,
    A.previous_B_identifier PRV_PRPS_PYMT_ ID,
    B.subsidiary_period_start_year PRV_PRPS_PYMT_SBSD_PRD_STRT_YR,
    B.state_fsa_code PRV_PRPS_PYMT_ST_FSA_CD,
    A.payment_amount PYMT_AMT,
    A.payment_attribution_request_reason_text PYMT_ATRB_RQST_RSN_TXT,
    A.payment_attribution_reversed_date PYMT_ATRB_RVRS_DT,
    A.payment_limitation_indicator PYMT_LMT_IND,
    A.payment_limitation_year PYMT_LMT_YR,
    A.limited_payment_program_identifier PYMT_PGM_ID,
    A.record_reference_primary_identification RCD_REF_PRIM_ID,
    A.record_reference_primary_identification_type_code RCD_REF_PRIM_ID_TYPE_CD,
    A.record_reference_secondary_identification RCD_REF_SCND_ID,
    A.record_reference_secondary_identification_type_code RCD_REF_SCND_ID_TYPE_CD,
    A.subsidiary_period_start_year SBSD_PRD_STRT_YR,
    A.state_fsa_code ST_FSA_CD,
    A.A_identifier UCHG_ATRB_PRPS_PYMT_LOG_ID
    FROM <V_SRC_SCHM_NM>.A  (Driving Table)
    left join <V_SRC_SCHM_NM>.B
    on (A.previous_B_identifier =B.B_identifier)
    Left Join 
     (SELECT * 
                FROM
                (SELECT *
                    ,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk  
                    FROM
                        (SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.C                        where parent_payment_attribution_identifier is NULL
                        UNION
                        SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.D                        where parent_payment_attribution_identifier is NULL
                        ) attr_union
                 )attr_all
                WHERE 1 = 1
                AND rnk = 1
                ) ALIAS 
     on (B.B_identifier =ALIAS.B_identifier)
       LEFT JOIN <V_SRC_SCHM_NM>.F
        ON (ALIAS.paid_subsidiary_customer_identifier = F_customer_identifier)
        /*get paid customer fsa state county information*/
        LEFT JOIN <V_SRC_SCHM_NMv.G
        ON (F_identifier = G_county_identifier)

    I Just changed  the tables names,

    A is the driving table which has 763 Mill Records
    B is the Joining Table which has 104 Million

    I am trying batch load  this table so i am selecting 50 million from the driving table I cant restrict the records in Joins and the joining tables are huge as well and especially below part of the query is killing the performance 

    SELECT * 
                FROM
                (SELECT *
                    ,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk  
                    FROM
                        (SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.C                        where parent_payment_attribution_identifier is NULL
                        UNION
                        SELECT 
                            payment_attribution_identifier,
                            subsidiary_customer_identifier,
                            paid_subsidiary_customer_identifier,
                            B_identifier,
                            last_change_date
                            FROM <V_SRC_SCHM_NM>.D                        where parent_payment_attribution_identifier is NULL
                        ) attr_union
                 )attr_all
                WHERE 1 = 1
                AND rnk = 1
                ) ALIAS 
     on (B.B_identifier =ALIAS.B_identifier)

    Table where i am doing union and later Rank has 90 Mill and 300 Mill respectively

    Any suggestion son tuning are appreciated, Thank you.

    Start with creating a temp table to hold the most recent "last_change_date" for each payment_attribution_identifier, once for each of the two different tables, and then join to that temp table on those 2 columns within the union portion of the query, and see if that helps.

    Steve,

    Thank you for your suggestion, I followed following steps based on your suggestion,

    --CREATE temp Tables

    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
    subsidiary_DP.dbo.payment_attribution group by payment_attribution_identifier

    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
    subsidiary_DP.dbo.payment_attribution_audit group by payment_attribution_identifier


    -- UNION TEMP Tables

    SELECT payment_attribution_identifier, last_change_date from into # PAI_FINAL(
    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
    <Table A> group by payment_attribution_identifier
    UNION
    select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
    <Table B> group by payment_attribution_identifier)
    group by payment_attribution_identifier

    -- Join Temp Table to get otjer columns from PA table
    SELECT
    PA.payment_attribution_identifier,
    PA.subsidiary_customer_identifier,
    PA.paid_subsidiary_customer_identifier,
    PA.proposed_payment_identifier,
    PA.last_change_date INTO #PAI2 FROM
    #PAI T1
    JOIN
    <Table A> PA ON
    T1.payment_attribution_identifier=PA.payment_attribution_identifier
    AND parent_payment_attribution_identifier is NULL

    SELECT * FROm #PAI2

    -- Join Temp Table to get other columns from PAA table
    SELECT
    T1.payment_attribution_identifier,
    PA.subsidiary_customer_identifier,
    PA.paid_subsidiary_customer_identifier,
    PA.proposed_payment_identifier,
    T1.last_change_date INTO #PAI3 FROM
    #PAIA T1
    LEFT JOIN
    <Table B> PA ON
    T1.payment_attribution_identifier=PA.payment_attribution_identifier
    AND T1.last_change_date=PA.last_change_date
    AND parent_payment_attribution_identifier is NULL --(32135783 row(s) affected)

    -- UNIOn Temp Tables,
    Select * INTo #PAI_FINAL from
        (SELECT *,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
        FROM (
            SELECT * FROM #PAI2
            UNION
            SELECT * FROM #PAI3
            ) PA_ALL
    ) PA_ALL_R WHERE rnk = 1

    This entire process completed in 15-20 min, can you suggest how we can put this into a single query so that i can use it in the main query ? Thank you

    What I was suggesting encompasses the portion of your code that I made bold and italicized with a yellow background.   You then use the rest of your query, and in each of the two unioned pieces of your original query, you join to the appropriate temp table using an INNER JOIN.   No need to try and UNION the two temp tables, as that just goes further down the rabbit hole, and introduces another possibility - that you get two different max date values for the same identifier.  You may be best keeping those entirely separated, but without complete knowledge of your data, it's impossible to know for sure.   And do follow Lynn's advice.   One query is not always a good idea, and should not be any kind of guiding principal.  If need be, a stored procedure can always be created to overcome any limitations in some kind of tool that requires a single query.   It's quite often best to break down large problems into smaller pieces that can more easily be handled in smaller chunks.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply