Query running very slow

  • Hi,

    I need to re-write a part of the SQL query to make it run fast. I need help to replace the MAX function which I think is making the query very slow.

    LEFT JOIN

    (

    SELECT distinct

    BL_ID,

    MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN NAME ELSE NULL END) AS [SHIPPER],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN NAME ELSE NULL END) AS [CONSIGNEE],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN NAME ELSE NULL END) AS [NOTIFY],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN DUNS ELSE NULL END) AS [SHIPPER_DUNS],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN DUNS ELSE NULL END) AS [CONSIGNEE_DUNS],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN DUNS ELSE NULL END) AS [NOTIFY_DUNS],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN COMPANY_NAME ELSE NULL END) AS [SHIPPER_CUSTNAME],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN COMPANY_NAME ELSE NULL END) AS [CONSIGNEE_CUSTNAME],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN COMPANY_NAME ELSE NULL END) AS [NOTIFY_CUSTNAME]

    FROM MG_BL_PARTY

    join MG_COMPANY

    on MG_BL_PARTY.COMPANY_ID = MG_COMPANY.COMPANY_ID

    WHERE COMPANY_ROLE_CD IN ('SH','CN','NP')

    AND ORDER_SEQ_NBR = 1

    GROUP BY BL_ID

    )MGP

    ON MG_BILL_OF_LADING.BL_ID = MGP.BL_ID

    Please find the SQL query below:

    SELECT DISTINCT

    MG_BILL_OF_LADING.BL_NUM

    ,MG_BL_CONTAINER.CONTAINER_NUM

    ,MG_BILL_OF_LADING.POR_LOCATION_CD

    ,MG_BILL_OF_LADING.POL_LOCATION_CD

    ,MG_BILL_OF_LADING.TRANSHIPMENT_PORT_CD

    ,MG_BILL_OF_LADING.POD_LOCATION_CD

    ,MG_BILL_OF_LADING.PDL_LOCATION_CD

    ,MG_BILL_OF_LADING.POI_LOCATION_CD

    ,MG_BILL_OF_LADING.BL_OK_FLG

    ,MG_BILL_OF_LADING.DELETED_FLG

    ,MG_BILL_OF_LADING.DELETED_DT

    ,MG_BILL_OF_LADING.CREATE_DT

    ,MG_BILL_OF_LADING.SC_NUM

    ,MG_BILL_OF_LADING.TRADE_CD

    ,SHIPPER

    ,CONSIGNEE

    ,NOTIFY

    ,SHIPPER_DUNS

    ,CONSIGNEE_DUNS

    ,NOTIFY_DUNS

    ,SHIPPER_CUSTNAME

    ,CONSIGNEE_CUSTNAME

    ,NOTIFY_CUSTNAME

    ,MG_BL_CONTAINER.CONTAINER_PREFIX

    ,MG_BL_CONTAINER.CONTAINER_CHECK

    ,MG_BL_CONTAINER.CONTAINER_TYPE_CD

    ,MG_BL_CONTAINER.CONTAINER_SIZE_CD

    ,MG_VESSEL_PARTICULAR.VESSEL_TYPE_CD

    ,MG_BL_ITINERARY.VESSEL_CD

    ,MG_BL_ITINERARY.VOYAGE_CD

    ,MG_BL_ITINERARY.LEG_CD

    ,MG_BL_ITINERARY.FROM_LOCATION_CD

    ,MG_BL_ITINERARY.ETD_DT

    ,MG_BL_ITINERARY.ETA_DT

    ,MG_BL_ITINERARY.TO_LOCATION_CD

    ,CASE WHEN MG_BL_CONTAINER.CONTAINER_SIZE_CD = '20' THEN 1 WHEN MG_BL_CONTAINER.CONTAINER_SIZE_CD IN ('40','45') THEN 2 END AS TEU,

    MG_BL_ITINERARY.FROM_FACILITY_CD,

    MG_BL_ITINERARY.TO_FACILITY_CD,

    --MG_VSLVOY_SCHEDULE.ARRIVAL_ACTUAL_DT,

    MG_BILL_OF_LADING.BL_TYPE_CD

    FROM

    MG_BILL_OF_LADING

    INNER JOIN

    MG_BL_CONTAINER

    ON MG_BILL_OF_LADING.BL_ID = MG_BL_CONTAINER.BL_ID

    LEFT JOIN

    (

    SELECT distinct

    BL_ID,

    MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN NAME ELSE NULL END) AS [SHIPPER],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN NAME ELSE NULL END) AS [CONSIGNEE],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN NAME ELSE NULL END) AS [NOTIFY],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN DUNS ELSE NULL END) AS [SHIPPER_DUNS],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN DUNS ELSE NULL END) AS [CONSIGNEE_DUNS],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN DUNS ELSE NULL END) AS [NOTIFY_DUNS],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN COMPANY_NAME ELSE NULL END) AS [SHIPPER_CUSTNAME],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN COMPANY_NAME ELSE NULL END) AS [CONSIGNEE_CUSTNAME],

    MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN COMPANY_NAME ELSE NULL END) AS [NOTIFY_CUSTNAME]

    FROM MG_BL_PARTY

    join MG_COMPANY

    on MG_BL_PARTY.COMPANY_ID = MG_COMPANY.COMPANY_ID

    WHERE COMPANY_ROLE_CD IN ('SH','CN','NP')

    AND ORDER_SEQ_NBR = 1

    GROUP BY BL_ID

    )MGP

    ON MG_BILL_OF_LADING.BL_ID = MGP.BL_ID

    Could someone please help on this ? Thanks in advance for your help.

  • You should get rid of the DISTINCT in the subquery; since you're GROUPing on BL_ID, you don't need it anyway.

    Otherwise it's difficult to re-write because there are no aliases on the other columns, so I don't know which tables COMPANY_ROLE_CD, ORDER_SEQ_NBR, etc., are in.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Just out curiousity are those varchar fields you're using MAX on? While that will work is that really what you want?

  • ZZartin (3/25/2015)


    Just out curiousity are those varchar fields you're using MAX on? While that will work is that really what you want?

    Yes, those are all varchar fields. Is there some way to replace the MAX function ? Thanks.

  • ScottPletcher (3/25/2015)


    You should get rid of the DISTINCT in the subquery; since you're GROUPing on BL_ID, you don't need it anyway.

    Otherwise it's difficult to re-write because there are no aliases on the other columns, so I don't know which tables COMPANY_ROLE_CD, ORDER_SEQ_NBR, etc., are in.

    Thanks, I removed the distinct but the query is still slow. Any more ideas ? Thanks.

  • The first thing to looks for is execution plan , which tells alot about how optimizer is processing this . I would say please paste the Exec plan and start focusing on cost involved at different operator to optimize this .

    Have you checked number of reads and stats .

  • pwalter83 (3/25/2015)


    ZZartin (3/25/2015)


    Just out curiousity are those varchar fields you're using MAX on? While that will work is that really what you want?

    Yes, those are all varchar fields. Is there some way to replace the MAX function ? Thanks.

    The reason I ask is because unless those fields are specifically populated in such a way that they will always sort in such a way that the top value is what you always want MAX may not be giving you the data you want(or expect). Without knowing exactly how those fields are populated and what they mean it would be hard to offer an answer.

    As for performance, can you verify that the subquery is what is actually causing the performance issue? Run just the subquery and compare the run time of that to running the entire query.

  • pwalter83 (3/25/2015)


    ScottPletcher (3/25/2015)


    You should get rid of the DISTINCT in the subquery; since you're GROUPing on BL_ID, you don't need it anyway.

    Otherwise it's difficult to re-write because there are no aliases on the other columns, so I don't know which tables COMPANY_ROLE_CD, ORDER_SEQ_NBR, etc., are in.

    Thanks, I removed the distinct but the query is still slow. Any more ideas ? Thanks.

    Again, please alias ALL columns in the query, otherwise I can't tell which table each column comes from. You have to remember we don't know your data -- it's easy for you to know which column is from which table, but we've never seen the tables before, and so can't possibly know.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • I'd really focus on the execution plan. I suspect the MAX itself isn't causing issues but rather the GROUP BY.

    One way around it would be to try getting a TOP 1 with an ORDER BY instead of a MAX & GROUP BY. That doesn't require aggregation so it can run faster.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (3/25/2015)


    I'd really focus on the execution plan. I suspect the MAX itself isn't causing issues but rather the GROUP BY.

    One way around it would be to try getting a TOP 1 with an ORDER BY instead of a MAX & GROUP BY. That doesn't require aggregation so it can run faster.

    The MAX and the GROUP BY is a form of PIVOT known as a CROSS TAB and it's not the problem. The problem is that they're trying to join to a Pivot/CrossTab. The best thing to do here would be to do the Divide'n'Conquer thing and do the cross tab separately into a Temp table and then join on that temp table. Done correctly, I believe you'll see at least an order of magnitude in performance improvement.

    As a side bar, I'll never understand why people think they need to do things in a single query. That's NOT what set-based code is at all.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Please provide a description of what this query is attempting to accomplish. For example, if all you're trying to do is return the most recent row for each BL_ID, then that can be accomplished by apply a filer to a DENSE_RANK() or ROW_NUMBER() function rather than use of aggregation.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I'd also agree you're better off posting the execution plan rather than just removing elements of your query.

    Admittedly, the more experienced you are, the more likely you will be able to spot code smells but still...

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply