Suggestions on how to improve this View that is using correlated subqueries

  • Hello everyone,

    Can anyone please suggest me on how can I avoid using correlated subqueries in the following create view statement:

    CREATE VIEW [dbo].[PDR_V_FUTURE_POLICY_VERS] AS

    SELECT pol.Source_System_Identifier, pol.BATCH_NUM, pol.ACCT_NO, pol.stat_header_id, pol.POLICY_NO, pol.POLICY_NBR_SEQ, pol.POLICY_ID, pol.VERS_CNTL_MAJ_MIN, pol.vers_data_maj_min,

    pol.DATA_SET_NBR, sts.cur_status_c AS attrb_data, pol.tran_eff_d policy_tran, sts.effective_d status_tran

    FROM dbo.PDR_POL pol,

    (SELECT t.acct_no, t.policy_id, t.cur_status_c, t.effective_d, t.process_dt

    FROM dbo.pol_policy_status_v AS t

    WHERE t.policy_status_v_id = (SELECT MAX(i.policy_status_v_id)

    FROM dbo.pol_policy_status_v AS i

    WHERE i.acct_no = t.acct_no

    AND i.policy_id = t.policy_id

    AND i.process_dt = (SELECT MAX(ii.process_dt)

    FROM dbo.pol_policy_status_v AS ii

    WHERE ii.acct_no = i.acct_no

    AND ii.policy_id = i.policy_id

    AND ii.effective_d = (SELECT MAX(iii.effective_d)

    FROM dbo.pol_policy_status_v AS iii

    WHERE iii.acct_no = ii.acct_no

    AND iii.policy_id = ii.policy_id)))) AS sts

    WHERE pol.acct_no = sts.acct_no

    AND pol.policy_id = sts.policy_id

    AND pol.data_set_nbr = (SELECT MAX(i.data_set_nbr)

    FROM dbo.pdr_pol AS i

    WHERE i.acct_no = pol.acct_no

    AND i.policy_id = pol.policy_id

    AND i.stat_tran_c not like '%CNCLPA')

    AND (((pol.stat_tran_c = 'CANCEL' OR pol.stat_tran_c = 'CANREWRITE') AND sts.cur_status_c = 'CANCEL')

    OR (pol.stat_tran_c in ('REINSTATE','REINWGAP') AND sts.cur_status_c <> 'CANCEL'

    AND pol.vers_cntl_maj_min = (SELECT MAX(i.vers_cntl_maj_min)

    FROM dbo.pdr_pol AS i

    WHERE i.acct_no = pol.acct_no

    AND i.policy_id = pol.policy_id

    AND i.data_set_nbr = pol.data_set_nbr)

    AND pol.vers_cntl_maj_min <> pol.vers_data_maj_min)

    OR (pol.stat_tran_c NOT IN ('REINSTATE','REINWGAP') AND sts.cur_status_c <> 'CANCEL'

    AND pol.vers_cntl_maj_min = (SELECT MAX(i.vers_cntl_maj_min)

    FROM dbo.pdr_pol AS i

    WHERE i.acct_no = pol.acct_no

    AND i.policy_id = pol.policy_id

    AND i.data_set_nbr = pol.data_set_nbr)

    and pol.vers_cntl_maj_min = pol.vers_data_maj_min));

    GO

    Any suggestions would be highly appreciated.

    Thanks

    T-SQL beginner

  • Can you provide the requirements of view? Something like:

    "Return the Policy information for active policies based on the most recent status of the policy."

    I know that isn't right, but it is a start based on what I'm seeing in the query. I also don't know enough about the schema of the database and the data to know relationship between columns. I wouldn't think that you'd have to navigate from max effective date to max process data to get the max policy status v id.

  • Thanks for your reply Jack.

    Basically the reason for creating this view is to get the latest version of policy since there are a lot of of policies in the database. My management has asked me if I can improve the query in a way that it is easily understandable by the person reading it. So any suggestion on that end would be great ...

  • T-SQL beginnger (7/16/2014)


    Thanks for your reply Jack.

    Basically the reason for creating this view is to get the latest version of policy since there are a lot of of policies in the database. My management has asked me if I can improve the query in a way that it is easily understandable by the person reading it. So any suggestion on that end would be great ...

    Okay, so what are the rules that determine the latest version of a policy? Where I work (at a company in the insurance business) the most recent version of a policy is the one with the most recent effective date not greater than today. In your case it looks like there are quite a few more rules than that.

    For readability alone I'd start by breaking out the sub-queries into CTE's because I find them easier to read. So I might start with something like this:

    WITH max_status_effective_d

    AS (

    SELECT

    iii.acct_no,

    iii.policy_id,

    MAX(iii.effective_d) AS effective_d

    FROM

    dbo.pol_policy_status_v AS iii

    GROUP BY

    iii.acct_no,

    iii.policy_id

    ),

    max_process_dt

    AS (

    SELECT

    ii.acct_no,

    ii.policy_id,

    MAX(ii.process_dt) AS process_dt

    FROM

    dbo.pol_policy_status_v AS ii

    WHERE

    EXISTS ( SELECT

    1

    FROM

    max_status_effective_d AS ED

    WHERE

    ED.acct_no = ii.acct_no AND

    ED.policy_id = ii.policy_id AND

    ED.effective_d = ii.effective_d )

    GROUP BY

    ii.acct_no,

    ii.policy_id

    ),

    max_policy_status_v_id

    AS (

    SELECT

    i.acct_no,

    i.policy_id,

    MAX(i.policy_status_v_id) AS policy_status_v_id

    FROM

    dbo.pol_policy_status_v AS i

    WHERE

    EXISTS ( SELECT

    1

    FROM

    max_process_dt AS ED

    WHERE

    ED.acct_no = i.acct_no AND

    ED.policy_id = i.policy_id AND

    ED.process_dt = i.process_dt )

    GROUP BY

    i.acct_no,

    i.policy_id

    ),

    policy_status

    AS (

    SELECT

    t.acct_no,

    t.policy_id,

    t.cur_status_c,

    t.effective_d,

    t.process_dt

    FROM

    dbo.pol_policy_status_v AS t

    WHERE

    EXISTS ( SELECT

    1

    FROM

    max_policy_status_v_id AS i

    WHERE

    i.acct_no = t.acct_no AND

    i.policy_id = t.policy_id AND

    i.policy_status_v_id = max_policy_status_v_id.policy_status_v_id )

    )

    SELECT

    pol.Source_System_Identifier,

    pol.BATCH_NUM,

    pol.ACCT_NO,

    pol.stat_header_id,

    pol.POLICY_NO,

    pol.POLICY_NBR_SEQ,

    pol.POLICY_ID,

    pol.VERS_CNTL_MAJ_MIN,

    pol.vers_data_maj_min,

    pol.DATA_SET_NBR,

    sts.cur_status_c AS attrb_data,

    pol.tran_eff_d policy_tran,

    sts.effective_d status_tran

    FROM

    dbo.PDR_POL pol

    JOIN policy_status AS PS

    ON pol.acct_no = PS.acct_no AND

    pol.policy_id = PS.policy_id

    WHERE

    pol.data_set_nbr = (

    SELECT

    MAX(i.data_set_nbr)

    FROM

    dbo.pdr_pol AS i

    WHERE

    i.acct_no = pol.acct_no AND

    i.policy_id = pol.policy_id AND

    i.stat_tran_c NOT LIKE '%CNCLPA'

    ) AND

    (

    (

    (

    pol.stat_tran_c = 'CANCEL' OR

    pol.stat_tran_c = 'CANREWRITE'

    ) AND

    PS.cur_status_c = 'CANCEL'

    ) OR

    (

    pol.stat_tran_c IN ('REINSTATE', 'REINWGAP') AND

    PS.cur_status_c <> 'CANCEL' AND

    pol.vers_cntl_maj_min = (

    SELECT

    MAX(i.vers_cntl_maj_min)

    FROM

    dbo.pdr_pol AS i

    WHERE

    i.acct_no = pol.acct_no AND

    i.policy_id = pol.policy_id AND

    i.data_set_nbr = pol.data_set_nbr

    ) AND

    pol.vers_cntl_maj_min <> pol.vers_data_maj_min

    ) OR

    (

    pol.stat_tran_c NOT IN ('REINSTATE', 'REINWGAP') AND

    PS.cur_status_c <> 'CANCEL' AND

    pol.vers_cntl_maj_min = (

    SELECT

    MAX(i.vers_cntl_maj_min)

    FROM

    dbo.pdr_pol AS i

    WHERE

    i.acct_no = pol.acct_no AND

    i.policy_id = pol.policy_id AND

    i.data_set_nbr = pol.data_set_nbr

    ) AND

    pol.vers_cntl_maj_min = pol.vers_data_maj_min

    )

    );

    I know that there is a lot more that could be done for readability and performance as it seems like the query hits the same tables over and over and that could be reduced.

    If you read links 1 & 3 in my signature and posted as suggested in them I think I or someone else could be even more help.

  • and another possible approach. ** although without test data I can't be sure. 🙂

    I'd also recommend coming up with another way to identify the data_set_nbr. That is what is really going to kill your performance.

    CREATE VIEW [dbo].[PDR_V_FUTURE_POLICY_VERS]

    AS

    BEGIN

    ;WITH

    sts

    AS (SELECT acct_no, policy_id, cur_status_c

    ,MAX(effective_d) OVER (PARTITION BY acct_no, policy_id) AS maxEffDate

    ,MAX(process_dt) OVER (PARTITION BY acct_no, policy_id) AS maxPrcDate

    ,MAX(policy_status_v_id) OVER (PARTITION BY acct_no, policy_id) AS maxStatus

    FROM dbo.pol_policy_status_v AS t)

    SELECT

    pol.Source_System_Identifier,

    pol.BATCH_NUM,

    pol.ACCT_NO,

    pol.stat_header_id,

    pol.POLICY_NO,

    pol.POLICY_NBR_SEQ,

    pol.POLICY_ID,

    pol.VERS_CNTL_MAJ_MIN,

    pol.vers_data_maj_min,

    pol.DATA_SET_NBR,

    sts.cur_status_c AS attrb_data,

    pol.tran_eff_d as policy_tran,

    sts.maxEffDate AS status_tran

    FROM dbo.PDR_POL pol

    INNER JOIN sts

    ON sts.acct_no = pol.acct_no

    AND sts.policy_id = pol.policy_id

    AND sts.maxPrcDate = pol.process_dt

    AND sts.maxEffDate = pol.effective_d

    AND sts.maxStatus = pol.policy_status_v_id

    WHERE

    AND

    pol.data_set_nbr = (SELECT MAX(i.data_set_nbr)

    FROM dbo.pdr_pol AS i

    WHERE i.acct_no = pol.acct_no

    AND i.policy_id = pol.policy_id

    AND i.stat_tran_c not like '%CNCLPA')

    AND (((pol.stat_tran_c = 'CANCEL' OR pol.stat_tran_c = 'CANREWRITE') AND sts.cur_status_c = 'CANCEL')

    OR (pol.stat_tran_c in ('REINSTATE','REINWGAP') AND sts.cur_status_c <> 'CANCEL'

    AND pol.vers_cntl_maj_min = (SELECT MAX(i.vers_cntl_maj_min)

    FROM dbo.pdr_pol AS i

    WHERE i.acct_no = pol.acct_no

    AND i.policy_id = pol.policy_id

    AND i.data_set_nbr = pol.data_set_nbr)

    AND pol.vers_cntl_maj_min <> pol.vers_data_maj_min)

    OR (pol.stat_tran_c NOT IN ('REINSTATE','REINWGAP') AND sts.cur_status_c <> 'CANCEL'

    AND pol.vers_cntl_maj_min = (SELECT MAX(i.vers_cntl_maj_min)

    FROM dbo.pdr_pol AS i

    WHERE i.acct_no = pol.acct_no

    AND i.policy_id = pol.policy_id

    AND i.data_set_nbr = pol.data_set_nbr)

    and pol.vers_cntl_maj_min = pol.vers_data_maj_min));

    END

    GO

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • Jason Selburg (7/16/2014)


    and another possible approach. ** although without test data I can't be sure. 🙂

    I'd also recommend coming up with another way to identify the data_set_nbr. That is what is really going to kill your performance.

    I agree with all that. I thought about doing the CTE you did, but I wasn't confident that it would be the same as the nested sub-queries (again, no test data)m, so I just went for something I find easier to read to start with.

    I also think leveraging ROW_NUMBER to get to the most recent status row may be faster. I've seen examples where using row_number instead of MAX to get a specific row has been more efficient.

  • Jack Corbett (7/16/2014)


    I also think leveraging ROW_NUMBER to get to the most recent status row may be faster. I've seen examples where using row_number instead of MAX to get a specific row has been more efficient.

    Funny, that was the way I wrote it first.

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • I made a few changes to give you an idea. They might be wrong, but hopefully you can modify them accordingly.

    WITH PStatus AS(

    SELECT t.acct_no

    ,t.policy_id

    ,t.cur_status_c

    ,t.effective_d

    ,t.process_dt

    ,ROW_NUMBER() OVER(PARTITION BY acct_no,

    policy_id

    ORDER BY effective_id DESC,

    process_dt DESC,

    policy_status_v_id DESC ) rn

    FROM dbo.pol_policy_status_v

    )

    SELECT pol.Source_System_Identifier

    ,pol.BATCH_NUM

    ,pol.ACCT_NO

    ,pol.stat_header_id

    ,pol.POLICY_NO

    ,pol.POLICY_NBR_SEQ

    ,pol.POLICY_ID

    ,pol.VERS_CNTL_MAJ_MIN

    ,pol.vers_data_maj_min

    ,pol.DATA_SET_NBR

    ,sts.cur_status_c AS attrb_data

    ,pol.tran_eff_d policy_tran

    ,sts.effective_d status_tran

    FROM dbo.PDR_POL pol

    JOIN PStatus AS sts ON pol.acct_no = sts.acct_no

    AND pol.policy_id = sts.policy_id

    WHERE sts.rn = 1 --The most recent from pol_policy_status_v (or something like that)

    AND pol.data_set_nbr = (

    SELECT MAX(i.data_set_nbr)

    FROM dbo.pdr_pol AS i

    WHERE i.acct_no = pol.acct_no

    AND i.policy_id = pol.policy_id

    AND i.stat_tran_c NOT LIKE '%CNCLPA'

    )

    AND (

    (

    pol.stat_tran_c IN( 'CANCEL', 'CANREWRITE')

    AND sts.cur_status_c = 'CANCEL'

    )

    OR (

    pol.stat_tran_c IN ('REINSTATE','REINWGAP')

    AND sts.cur_status_c <> 'CANCEL'

    AND pol.vers_cntl_maj_min = (

    SELECT MAX(i.vers_cntl_maj_min)

    FROM dbo.pdr_pol AS i

    WHERE i.acct_no = pol.acct_no

    AND i.policy_id = pol.policy_id

    AND i.data_set_nbr = pol.data_set_nbr

    )

    AND pol.vers_cntl_maj_min <> pol.vers_data_maj_min

    )

    OR (

    pol.stat_tran_c NOT IN ('REINSTATE','REINWGAP')

    AND sts.cur_status_c <> 'CANCEL'

    AND pol.vers_cntl_maj_min = (

    SELECT MAX(i.vers_cntl_maj_min)

    FROM dbo.pdr_pol AS i

    WHERE i.acct_no = pol.acct_no

    AND i.policy_id = pol.policy_id

    AND i.data_set_nbr = pol.data_set_nbr

    )

    AND pol.vers_cntl_maj_min = pol.vers_data_maj_min

    )

    );

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks a lot guys for your help. I actually joined this insurance company like a week ago and they have asked me to help improve this code.

    I understand that you guys don't have the test data which doesn't helps the cause.

    I actually tried running both the queries and for Jack's solution I get the following error:

    Msg 4104, Level 16, State 1, Line 73

    The multi-part identifier "max_policy_status_v_id.policy_status_v_id" could not be bound.

    whereas for Jason's solution I am getting these errors:

    Msg 207, Level 16, State 1, Line 26

    Invalid column name 'process_dt'.

    Msg 207, Level 16, State 1, Line 28

    Invalid column name 'policy_status_v_id'

    I will try to check the underlying tables and fix these errors to see if I get the same result set as with the original view and let you know.

    Thanks again for your, really appreciate it !

  • For the error in mine you just need to replace the max_policy_status_v_id to i because I aliased the CTE in the FROM clause.

  • I actually changed it to t and sts aliases to PS later in the code and it works fine.

    I'm getting the same result set as original query.

    Thanks

  • T-SQL beginnger (7/16/2014)


    I actually changed it to t and sts aliases to PS later in the code and it works fine.

    I'm getting the same result set as original query.

    Thanks

    Not saying that my approach is better, but you should always try different methods to see which performs better in your environment. ** just good practice IMHO.

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • Jason, I tried fixing the errors with your code and now I'm getting this error:

    Msg 206, Level 16, State 2, Line 1

    Operand type clash: date is incompatible with int

    which is weird but I'm still looking into it

  • Yeah, I will definitely try all three solutions and see which one is better.

  • Jason Selburg (7/16/2014)


    Not saying that my approach is better, but you should always try different methods to see which performs better in your environment. ** just good practice IMHO.

    I'm pretty confident your approach will out-perform what I did because I didn't even try to improve performance, just readability. I just wasn't confident enough that I'd get the logic even close to the same, which is why I had the questions in my post with the code.

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply