May 14, 2018 at 3:03 pm
Hi,
Can you please suggest what can be done to tune the following query ?
SELECT
A.accounting_program_code ACCT_PGM_CD,
A.apply_3_ownership_level_limit_indicator APP_3_OWNSHP_LVL_LMT_IND,
A.commodity_code CMDY_CD,
A.county_fsa_code CNTY_FSA_CD,
A.creation_date CRE_DT,
A.creation_user_name CRE_USER_NM,
A.data_status_code DATA_STAT_CD,
A.decimal_precision_number DEC_PRCS_NBR,
A.hierarchical_payment_limitation_indicator HRCHL_PYMT_LMT_IND,
A.last_change_date LAST_CHG_DT,
A.last_change_user_name LAST_CHG_USER_NM,
A.override_subsidiary_period_start_year OVRRD_SBSD_PRD_STRT_YR,
Paid_subsidiary_customer.core_customer_identifier PAID_CORE_CUST_ID,
paid_fsa_county.state_county_fsa_code PAID_CORE_CUST_ST_CNTY_FSA_CD,
A.processing_request_reference_number PROC_RQST_REF_NBR,
A.persist_overpayment_attributions_indicator PRST_OPYMT_ATRB_IND,
B.accounting_program_code PRV_PRPS_PYMT_ACCT_PGM_CD,
B.county_fsa_code PRV_PRPS_PYMT_CNTY_FSA_CD,
A.previous_B_identifier PRV_PRPS_PYMT_ ID,
B.subsidiary_period_start_year PRV_PRPS_PYMT_SBSD_PRD_STRT_YR,
B.state_fsa_code PRV_PRPS_PYMT_ST_FSA_CD,
A.payment_amount PYMT_AMT,
A.payment_attribution_request_reason_text PYMT_ATRB_RQST_RSN_TXT,
A.payment_attribution_reversed_date PYMT_ATRB_RVRS_DT,
A.payment_limitation_indicator PYMT_LMT_IND,
A.payment_limitation_year PYMT_LMT_YR,
A.limited_payment_program_identifier PYMT_PGM_ID,
A.record_reference_primary_identification RCD_REF_PRIM_ID,
A.record_reference_primary_identification_type_code RCD_REF_PRIM_ID_TYPE_CD,
A.record_reference_secondary_identification RCD_REF_SCND_ID,
A.record_reference_secondary_identification_type_code RCD_REF_SCND_ID_TYPE_CD,
A.subsidiary_period_start_year SBSD_PRD_STRT_YR,
A.state_fsa_code ST_FSA_CD,
A.A_identifier UCHG_ATRB_PRPS_PYMT_LOG_ID
FROM <V_SRC_SCHM_NM>.A (Driving Table)
left join <V_SRC_SCHM_NM>.B
on (A.previous_B_identifier =B.B_identifier)
Left Join
(SELECT *
FROM
(SELECT *
,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM
(SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.C where parent_payment_attribution_identifier is NULL
UNION
SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.D where parent_payment_attribution_identifier is NULL
) attr_union
)attr_all
WHERE 1 = 1
AND rnk = 1
) ALIAS
on (B.B_identifier =ALIAS.B_identifier)
LEFT JOIN <V_SRC_SCHM_NM>.F
ON (ALIAS.paid_subsidiary_customer_identifier = F_customer_identifier)
/*get paid customer fsa state county information*/
LEFT JOIN <V_SRC_SCHM_NMv.G
ON (F_identifier = G_county_identifier)
I Just changed the tables names,
A is the driving table which has 763 Mill Records
B is the Joining Table which has 104 Million
I am trying batch load this table so i am selecting 50 million from the driving table I cant restrict the records in Joins and the joining tables are huge as well and especially below part of the query is killing the performance
SELECT *
FROM
(SELECT *
,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM
(SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.C where parent_payment_attribution_identifier is NULL
UNION
SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.D where parent_payment_attribution_identifier is NULL
) attr_union
)attr_all
WHERE 1 = 1
AND rnk = 1
) ALIAS
on (B.B_identifier =ALIAS.B_identifier)
Table where i am doing union and later Rank has 90 Mill and 300 Mill respectively
Any suggestion son tuning are appreciated, Thank you.
May 14, 2018 at 3:24 pm
masstudy23 - Monday, May 14, 2018 3:03 PMHi,Can you please suggest what can be done to tune the following query ?
SELECT
A.accounting_program_code ACCT_PGM_CD,
A.apply_3_ownership_level_limit_indicator APP_3_OWNSHP_LVL_LMT_IND,
A.commodity_code CMDY_CD,
A.county_fsa_code CNTY_FSA_CD,
A.creation_date CRE_DT,
A.creation_user_name CRE_USER_NM,
A.data_status_code DATA_STAT_CD,
A.decimal_precision_number DEC_PRCS_NBR,
A.hierarchical_payment_limitation_indicator HRCHL_PYMT_LMT_IND,
A.last_change_date LAST_CHG_DT,
A.last_change_user_name LAST_CHG_USER_NM,
A.override_subsidiary_period_start_year OVRRD_SBSD_PRD_STRT_YR,
Paid_subsidiary_customer.core_customer_identifier PAID_CORE_CUST_ID,
paid_fsa_county.state_county_fsa_code PAID_CORE_CUST_ST_CNTY_FSA_CD,
A.processing_request_reference_number PROC_RQST_REF_NBR,
A.persist_overpayment_attributions_indicator PRST_OPYMT_ATRB_IND,
B.accounting_program_code PRV_PRPS_PYMT_ACCT_PGM_CD,
B.county_fsa_code PRV_PRPS_PYMT_CNTY_FSA_CD,
A.previous_B_identifier PRV_PRPS_PYMT_ ID,
B.subsidiary_period_start_year PRV_PRPS_PYMT_SBSD_PRD_STRT_YR,
B.state_fsa_code PRV_PRPS_PYMT_ST_FSA_CD,
A.payment_amount PYMT_AMT,
A.payment_attribution_request_reason_text PYMT_ATRB_RQST_RSN_TXT,
A.payment_attribution_reversed_date PYMT_ATRB_RVRS_DT,
A.payment_limitation_indicator PYMT_LMT_IND,
A.payment_limitation_year PYMT_LMT_YR,
A.limited_payment_program_identifier PYMT_PGM_ID,
A.record_reference_primary_identification RCD_REF_PRIM_ID,
A.record_reference_primary_identification_type_code RCD_REF_PRIM_ID_TYPE_CD,
A.record_reference_secondary_identification RCD_REF_SCND_ID,
A.record_reference_secondary_identification_type_code RCD_REF_SCND_ID_TYPE_CD,
A.subsidiary_period_start_year SBSD_PRD_STRT_YR,
A.state_fsa_code ST_FSA_CD,
A.A_identifier UCHG_ATRB_PRPS_PYMT_LOG_ID
FROM <V_SRC_SCHM_NM>.A (Driving Table)
left join <V_SRC_SCHM_NM>.B
on (A.previous_B_identifier =B.B_identifier)
Left Join
(SELECT *
FROM
(SELECT *
,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM
(SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.C where parent_payment_attribution_identifier is NULL
UNION
SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.D where parent_payment_attribution_identifier is NULL
) attr_union
)attr_all
WHERE 1 = 1
AND rnk = 1
) ALIAS
on (B.B_identifier =ALIAS.B_identifier)
LEFT JOIN <V_SRC_SCHM_NM>.F
ON (ALIAS.paid_subsidiary_customer_identifier = F_customer_identifier)
/*get paid customer fsa state county information*/
LEFT JOIN <V_SRC_SCHM_NMv.G
ON (F_identifier = G_county_identifier)I Just changed the tables names,
A is the driving table which has 763 Mill Records
B is the Joining Table which has 104 MillionI am trying batch load this table so i am selecting 50 million from the driving table I cant restrict the records in Joins and the joining tables are huge as well and especially below part of the query is killing the performance
SELECT *
FROM
(SELECT *
,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM
(SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.C where parent_payment_attribution_identifier is NULL
UNION
SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.D where parent_payment_attribution_identifier is NULL
) attr_union
)attr_all
WHERE 1 = 1
AND rnk = 1
) ALIAS
on (B.B_identifier =ALIAS.B_identifier)Table where i am doing union and later Rank has 90 Mill and 300 Mill respectively
Any suggestion son tuning are appreciated, Thank you.
Start with creating a temp table to hold the most recent "last_change_date" for each payment_attribution_identifier, once for each of the two different tables, and then join to that temp table on those 2 columns within the union portion of the query, and see if that helps.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
May 14, 2018 at 10:17 pm
sgmunson - Monday, May 14, 2018 3:24 PMmasstudy23 - Monday, May 14, 2018 3:03 PMHi,Can you please suggest what can be done to tune the following query ?
SELECT
A.accounting_program_code ACCT_PGM_CD,
A.apply_3_ownership_level_limit_indicator APP_3_OWNSHP_LVL_LMT_IND,
A.commodity_code CMDY_CD,
A.county_fsa_code CNTY_FSA_CD,
A.creation_date CRE_DT,
A.creation_user_name CRE_USER_NM,
A.data_status_code DATA_STAT_CD,
A.decimal_precision_number DEC_PRCS_NBR,
A.hierarchical_payment_limitation_indicator HRCHL_PYMT_LMT_IND,
A.last_change_date LAST_CHG_DT,
A.last_change_user_name LAST_CHG_USER_NM,
A.override_subsidiary_period_start_year OVRRD_SBSD_PRD_STRT_YR,
Paid_subsidiary_customer.core_customer_identifier PAID_CORE_CUST_ID,
paid_fsa_county.state_county_fsa_code PAID_CORE_CUST_ST_CNTY_FSA_CD,
A.processing_request_reference_number PROC_RQST_REF_NBR,
A.persist_overpayment_attributions_indicator PRST_OPYMT_ATRB_IND,
B.accounting_program_code PRV_PRPS_PYMT_ACCT_PGM_CD,
B.county_fsa_code PRV_PRPS_PYMT_CNTY_FSA_CD,
A.previous_B_identifier PRV_PRPS_PYMT_ ID,
B.subsidiary_period_start_year PRV_PRPS_PYMT_SBSD_PRD_STRT_YR,
B.state_fsa_code PRV_PRPS_PYMT_ST_FSA_CD,
A.payment_amount PYMT_AMT,
A.payment_attribution_request_reason_text PYMT_ATRB_RQST_RSN_TXT,
A.payment_attribution_reversed_date PYMT_ATRB_RVRS_DT,
A.payment_limitation_indicator PYMT_LMT_IND,
A.payment_limitation_year PYMT_LMT_YR,
A.limited_payment_program_identifier PYMT_PGM_ID,
A.record_reference_primary_identification RCD_REF_PRIM_ID,
A.record_reference_primary_identification_type_code RCD_REF_PRIM_ID_TYPE_CD,
A.record_reference_secondary_identification RCD_REF_SCND_ID,
A.record_reference_secondary_identification_type_code RCD_REF_SCND_ID_TYPE_CD,
A.subsidiary_period_start_year SBSD_PRD_STRT_YR,
A.state_fsa_code ST_FSA_CD,
A.A_identifier UCHG_ATRB_PRPS_PYMT_LOG_ID
FROM <V_SRC_SCHM_NM>.A (Driving Table)
left join <V_SRC_SCHM_NM>.B
on (A.previous_B_identifier =B.B_identifier)
Left Join
(SELECT *
FROM
(SELECT *
,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM
(SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.C where parent_payment_attribution_identifier is NULL
UNION
SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.D where parent_payment_attribution_identifier is NULL
) attr_union
)attr_all
WHERE 1 = 1
AND rnk = 1
) ALIAS
on (B.B_identifier =ALIAS.B_identifier)
LEFT JOIN <V_SRC_SCHM_NM>.F
ON (ALIAS.paid_subsidiary_customer_identifier = F_customer_identifier)
/*get paid customer fsa state county information*/
LEFT JOIN <V_SRC_SCHM_NMv.G
ON (F_identifier = G_county_identifier)I Just changed the tables names,
A is the driving table which has 763 Mill Records
B is the Joining Table which has 104 MillionI am trying batch load this table so i am selecting 50 million from the driving table I cant restrict the records in Joins and the joining tables are huge as well and especially below part of the query is killing the performance
SELECT *
FROM
(SELECT *
,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM
(SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.C where parent_payment_attribution_identifier is NULL
UNION
SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.D where parent_payment_attribution_identifier is NULL
) attr_union
)attr_all
WHERE 1 = 1
AND rnk = 1
) ALIAS
on (B.B_identifier =ALIAS.B_identifier)Table where i am doing union and later Rank has 90 Mill and 300 Mill respectively
Any suggestion son tuning are appreciated, Thank you.
Start with creating a temp table to hold the most recent "last_change_date" for each payment_attribution_identifier, once for each of the two different tables, and then join to that temp table on those 2 columns within the union portion of the query, and see if that helps.
Steve,
Thank you for your suggestion, I followed following steps based on your suggestion,
--CREATE temp Tables
select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
subsidiary_DP.dbo.payment_attribution group by payment_attribution_identifier
select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
subsidiary_DP.dbo.payment_attribution_audit group by payment_attribution_identifier
-- UNION TEMP Tables
SELECT payment_attribution_identifier, last_change_date from into # PAI_FINAL(
select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
<Table A> group by payment_attribution_identifier
UNION
select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
<Table B> group by payment_attribution_identifier)
group by payment_attribution_identifier
-- Join Temp Table to get otjer columns from PA table
SELECT
PA.payment_attribution_identifier,
PA.subsidiary_customer_identifier,
PA.paid_subsidiary_customer_identifier,
PA.proposed_payment_identifier,
PA.last_change_date INTO #PAI2 FROM
#PAI T1
JOIN
<Table A> PA ON
T1.payment_attribution_identifier=PA.payment_attribution_identifier
AND parent_payment_attribution_identifier is NULL
SELECT * FROm #PAI2
-- Join Temp Table to get other columns from PAA table
SELECT
T1.payment_attribution_identifier,
PA.subsidiary_customer_identifier,
PA.paid_subsidiary_customer_identifier,
PA.proposed_payment_identifier,
T1.last_change_date INTO #PAI3 FROM
#PAIA T1
LEFT JOIN
<Table B> PA ON
T1.payment_attribution_identifier=PA.payment_attribution_identifier
AND T1.last_change_date=PA.last_change_date
AND parent_payment_attribution_identifier is NULL --(32135783 row(s) affected)
-- UNIOn Temp Tables,
Select * INTo #PAI_FINAL from
(SELECT *,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM (
SELECT * FROM #PAI2
UNION
SELECT * FROM #PAI3
) PA_ALL
) PA_ALL_R WHERE rnk = 1
This entire process completed in 15-20 min, can you suggest how we can put this into a single query so that i can use it in the main query ? Thank you
May 14, 2018 at 11:03 pm
masstudy23 - Monday, May 14, 2018 10:17 PMsgmunson - Monday, May 14, 2018 3:24 PMmasstudy23 - Monday, May 14, 2018 3:03 PMHi,Can you please suggest what can be done to tune the following query ?
SELECT
A.accounting_program_code ACCT_PGM_CD,
A.apply_3_ownership_level_limit_indicator APP_3_OWNSHP_LVL_LMT_IND,
A.commodity_code CMDY_CD,
A.county_fsa_code CNTY_FSA_CD,
A.creation_date CRE_DT,
A.creation_user_name CRE_USER_NM,
A.data_status_code DATA_STAT_CD,
A.decimal_precision_number DEC_PRCS_NBR,
A.hierarchical_payment_limitation_indicator HRCHL_PYMT_LMT_IND,
A.last_change_date LAST_CHG_DT,
A.last_change_user_name LAST_CHG_USER_NM,
A.override_subsidiary_period_start_year OVRRD_SBSD_PRD_STRT_YR,
Paid_subsidiary_customer.core_customer_identifier PAID_CORE_CUST_ID,
paid_fsa_county.state_county_fsa_code PAID_CORE_CUST_ST_CNTY_FSA_CD,
A.processing_request_reference_number PROC_RQST_REF_NBR,
A.persist_overpayment_attributions_indicator PRST_OPYMT_ATRB_IND,
B.accounting_program_code PRV_PRPS_PYMT_ACCT_PGM_CD,
B.county_fsa_code PRV_PRPS_PYMT_CNTY_FSA_CD,
A.previous_B_identifier PRV_PRPS_PYMT_ ID,
B.subsidiary_period_start_year PRV_PRPS_PYMT_SBSD_PRD_STRT_YR,
B.state_fsa_code PRV_PRPS_PYMT_ST_FSA_CD,
A.payment_amount PYMT_AMT,
A.payment_attribution_request_reason_text PYMT_ATRB_RQST_RSN_TXT,
A.payment_attribution_reversed_date PYMT_ATRB_RVRS_DT,
A.payment_limitation_indicator PYMT_LMT_IND,
A.payment_limitation_year PYMT_LMT_YR,
A.limited_payment_program_identifier PYMT_PGM_ID,
A.record_reference_primary_identification RCD_REF_PRIM_ID,
A.record_reference_primary_identification_type_code RCD_REF_PRIM_ID_TYPE_CD,
A.record_reference_secondary_identification RCD_REF_SCND_ID,
A.record_reference_secondary_identification_type_code RCD_REF_SCND_ID_TYPE_CD,
A.subsidiary_period_start_year SBSD_PRD_STRT_YR,
A.state_fsa_code ST_FSA_CD,
A.A_identifier UCHG_ATRB_PRPS_PYMT_LOG_ID
FROM <V_SRC_SCHM_NM>.A (Driving Table)
left join <V_SRC_SCHM_NM>.B
on (A.previous_B_identifier =B.B_identifier)
Left Join
(SELECT *
FROM
(SELECT *
,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM
(SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.C where parent_payment_attribution_identifier is NULL
UNION
SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.D where parent_payment_attribution_identifier is NULL
) attr_union
)attr_all
WHERE 1 = 1
AND rnk = 1
) ALIAS
on (B.B_identifier =ALIAS.B_identifier)
LEFT JOIN <V_SRC_SCHM_NM>.F
ON (ALIAS.paid_subsidiary_customer_identifier = F_customer_identifier)
/*get paid customer fsa state county information*/
LEFT JOIN <V_SRC_SCHM_NMv.G
ON (F_identifier = G_county_identifier)I Just changed the tables names,
A is the driving table which has 763 Mill Records
B is the Joining Table which has 104 MillionI am trying batch load this table so i am selecting 50 million from the driving table I cant restrict the records in Joins and the joining tables are huge as well and especially below part of the query is killing the performance
SELECT *
FROM
(SELECT *
,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM
(SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.C where parent_payment_attribution_identifier is NULL
UNION
SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.D where parent_payment_attribution_identifier is NULL
) attr_union
)attr_all
WHERE 1 = 1
AND rnk = 1
) ALIAS
on (B.B_identifier =ALIAS.B_identifier)Table where i am doing union and later Rank has 90 Mill and 300 Mill respectively
Any suggestion son tuning are appreciated, Thank you.
Start with creating a temp table to hold the most recent "last_change_date" for each payment_attribution_identifier, once for each of the two different tables, and then join to that temp table on those 2 columns within the union portion of the query, and see if that helps.
Steve,
Thank you for your suggestion, I followed following steps based on your suggestion,
--CREATE temp Tables
select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
subsidiary_DP.dbo.payment_attribution group by payment_attribution_identifierselect payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
subsidiary_DP.dbo.payment_attribution_audit group by payment_attribution_identifier-- UNION TEMP Tables
SELECT payment_attribution_identifier, last_change_date from into # PAI_FINAL(
select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
<Table A> group by payment_attribution_identifier
UNION
select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
<Table B> group by payment_attribution_identifier)
group by payment_attribution_identifier-- Join Temp Table to get otjer columns from PA table
SELECT
PA.payment_attribution_identifier,
PA.subsidiary_customer_identifier,
PA.paid_subsidiary_customer_identifier,
PA.proposed_payment_identifier,
PA.last_change_date INTO #PAI2 FROM
#PAI T1
JOIN
<Table A> PA ON
T1.payment_attribution_identifier=PA.payment_attribution_identifier
AND parent_payment_attribution_identifier is NULLSELECT * FROm #PAI2
-- Join Temp Table to get other columns from PAA table
SELECT
T1.payment_attribution_identifier,
PA.subsidiary_customer_identifier,
PA.paid_subsidiary_customer_identifier,
PA.proposed_payment_identifier,
T1.last_change_date INTO #PAI3 FROM
#PAIA T1
LEFT JOIN
<Table B> PA ON
T1.payment_attribution_identifier=PA.payment_attribution_identifier
AND T1.last_change_date=PA.last_change_date
AND parent_payment_attribution_identifier is NULL --(32135783 row(s) affected)-- UNIOn Temp Tables,
Select * INTo #PAI_FINAL from
(SELECT *,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM (
SELECT * FROM #PAI2
UNION
SELECT * FROM #PAI3
) PA_ALL
) PA_ALL_R WHERE rnk = 1This entire process completed in 15-20 min, can you suggest how we can put this into a single query so that i can use it in the main query ? Thank you
How many rows of data are being processed?
You may need to see if there is anything more that you divide and conquer before doing your final query. Doing things all in one query isn't always the best the way to solve a problem. Sometimes to you have to break it down into smaller indivual pieces.
Also, use the SQL Code blocks to allow yourself to post formatted SQL code. It will make your code easier to read by others.
May 15, 2018 at 6:56 am
masstudy23 - Monday, May 14, 2018 10:17 PMsgmunson - Monday, May 14, 2018 3:24 PMmasstudy23 - Monday, May 14, 2018 3:03 PMHi,Can you please suggest what can be done to tune the following query ?
SELECT
A.accounting_program_code ACCT_PGM_CD,
A.apply_3_ownership_level_limit_indicator APP_3_OWNSHP_LVL_LMT_IND,
A.commodity_code CMDY_CD,
A.county_fsa_code CNTY_FSA_CD,
A.creation_date CRE_DT,
A.creation_user_name CRE_USER_NM,
A.data_status_code DATA_STAT_CD,
A.decimal_precision_number DEC_PRCS_NBR,
A.hierarchical_payment_limitation_indicator HRCHL_PYMT_LMT_IND,
A.last_change_date LAST_CHG_DT,
A.last_change_user_name LAST_CHG_USER_NM,
A.override_subsidiary_period_start_year OVRRD_SBSD_PRD_STRT_YR,
Paid_subsidiary_customer.core_customer_identifier PAID_CORE_CUST_ID,
paid_fsa_county.state_county_fsa_code PAID_CORE_CUST_ST_CNTY_FSA_CD,
A.processing_request_reference_number PROC_RQST_REF_NBR,
A.persist_overpayment_attributions_indicator PRST_OPYMT_ATRB_IND,
B.accounting_program_code PRV_PRPS_PYMT_ACCT_PGM_CD,
B.county_fsa_code PRV_PRPS_PYMT_CNTY_FSA_CD,
A.previous_B_identifier PRV_PRPS_PYMT_ ID,
B.subsidiary_period_start_year PRV_PRPS_PYMT_SBSD_PRD_STRT_YR,
B.state_fsa_code PRV_PRPS_PYMT_ST_FSA_CD,
A.payment_amount PYMT_AMT,
A.payment_attribution_request_reason_text PYMT_ATRB_RQST_RSN_TXT,
A.payment_attribution_reversed_date PYMT_ATRB_RVRS_DT,
A.payment_limitation_indicator PYMT_LMT_IND,
A.payment_limitation_year PYMT_LMT_YR,
A.limited_payment_program_identifier PYMT_PGM_ID,
A.record_reference_primary_identification RCD_REF_PRIM_ID,
A.record_reference_primary_identification_type_code RCD_REF_PRIM_ID_TYPE_CD,
A.record_reference_secondary_identification RCD_REF_SCND_ID,
A.record_reference_secondary_identification_type_code RCD_REF_SCND_ID_TYPE_CD,
A.subsidiary_period_start_year SBSD_PRD_STRT_YR,
A.state_fsa_code ST_FSA_CD,
A.A_identifier UCHG_ATRB_PRPS_PYMT_LOG_ID
FROM <V_SRC_SCHM_NM>.A (Driving Table)
left join <V_SRC_SCHM_NM>.B
on (A.previous_B_identifier =B.B_identifier)
Left Join
(SELECT *
FROM
(SELECT *
,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM
(SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.C where parent_payment_attribution_identifier is NULL
UNION
SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.D where parent_payment_attribution_identifier is NULL
) attr_union
)attr_all
WHERE 1 = 1
AND rnk = 1
) ALIAS
on (B.B_identifier =ALIAS.B_identifier)
LEFT JOIN <V_SRC_SCHM_NM>.F
ON (ALIAS.paid_subsidiary_customer_identifier = F_customer_identifier)
/*get paid customer fsa state county information*/
LEFT JOIN <V_SRC_SCHM_NMv.G
ON (F_identifier = G_county_identifier)I Just changed the tables names,
A is the driving table which has 763 Mill Records
B is the Joining Table which has 104 MillionI am trying batch load this table so i am selecting 50 million from the driving table I cant restrict the records in Joins and the joining tables are huge as well and especially below part of the query is killing the performance
SELECT *
FROM
(SELECT *
,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM
(SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.C where parent_payment_attribution_identifier is NULL
UNION
SELECT
payment_attribution_identifier,
subsidiary_customer_identifier,
paid_subsidiary_customer_identifier,
B_identifier,
last_change_date
FROM <V_SRC_SCHM_NM>.D where parent_payment_attribution_identifier is NULL
) attr_union
)attr_all
WHERE 1 = 1
AND rnk = 1
) ALIAS
on (B.B_identifier =ALIAS.B_identifier)Table where i am doing union and later Rank has 90 Mill and 300 Mill respectively
Any suggestion son tuning are appreciated, Thank you.
Start with creating a temp table to hold the most recent "last_change_date" for each payment_attribution_identifier, once for each of the two different tables, and then join to that temp table on those 2 columns within the union portion of the query, and see if that helps.
Steve,
Thank you for your suggestion, I followed following steps based on your suggestion,
--CREATE temp Tables
select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
subsidiary_DP.dbo.payment_attribution group by payment_attribution_identifierselect payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
subsidiary_DP.dbo.payment_attribution_audit group by payment_attribution_identifier
-- UNION TEMP TablesSELECT payment_attribution_identifier, last_change_date from into # PAI_FINAL(
select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAI from
<Table A> group by payment_attribution_identifier
UNION
select payment_attribution_identifier, max(last_change_date) as last_change_date into #PAIA from
<Table B> group by payment_attribution_identifier)
group by payment_attribution_identifier-- Join Temp Table to get otjer columns from PA table
SELECT
PA.payment_attribution_identifier,
PA.subsidiary_customer_identifier,
PA.paid_subsidiary_customer_identifier,
PA.proposed_payment_identifier,
PA.last_change_date INTO #PAI2 FROM
#PAI T1
JOIN
<Table A> PA ON
T1.payment_attribution_identifier=PA.payment_attribution_identifier
AND parent_payment_attribution_identifier is NULLSELECT * FROm #PAI2
-- Join Temp Table to get other columns from PAA table
SELECT
T1.payment_attribution_identifier,
PA.subsidiary_customer_identifier,
PA.paid_subsidiary_customer_identifier,
PA.proposed_payment_identifier,
T1.last_change_date INTO #PAI3 FROM
#PAIA T1
LEFT JOIN
<Table B> PA ON
T1.payment_attribution_identifier=PA.payment_attribution_identifier
AND T1.last_change_date=PA.last_change_date
AND parent_payment_attribution_identifier is NULL --(32135783 row(s) affected)-- UNIOn Temp Tables,
Select * INTo #PAI_FINAL from
(SELECT *,row_number() over (partition by payment_attribution_identifier order by last_change_date desc) as rnk
FROM (
SELECT * FROM #PAI2
UNION
SELECT * FROM #PAI3
) PA_ALL
) PA_ALL_R WHERE rnk = 1This entire process completed in 15-20 min, can you suggest how we can put this into a single query so that i can use it in the main query ? Thank you
What I was suggesting encompasses the portion of your code that I made bold and italicized with a yellow background. You then use the rest of your query, and in each of the two unioned pieces of your original query, you join to the appropriate temp table using an INNER JOIN. No need to try and UNION the two temp tables, as that just goes further down the rabbit hole, and introduces another possibility - that you get two different max date values for the same identifier. You may be best keeping those entirely separated, but without complete knowledge of your data, it's impossible to know for sure. And do follow Lynn's advice. One query is not always a good idea, and should not be any kind of guiding principal. If need be, a stored procedure can always be created to overcome any limitations in some kind of tool that requires a single query. It's quite often best to break down large problems into smaller pieces that can more easily be handled in smaller chunks.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply