Complex join help - query taking too long to run

  • Hello,

    I'm hoping someone can help me with a performance issue I have with my query. I am doing a full join on 3 tables but the query takes hours to run & I have to end up killing it. Below is my query - any help would be greatly appreciated!

    select distinct *,

    position AS POSITION_DERIVED, CAMPAIGN_PERSON_STATUS AS PERSON_STATUS_DERIVED,CAMPAIGN_COMPANY_NAME AS COMPANY_NAME_DERIVED,

    CAMPAIGN_FIRST_NAME + '' + CAMPAIGN_LAST_NAME AS FULL_NAME_DERIVED

    into #all_date_joined

    from CAMPAIGNS c (nolock) full outer join EVENTS a (nolock) on c.campaign_mpam_activity_id=a.mpam_activity_id

    and (a.semr_uuid=c.CAMPAIGN_UUID or (a.SEMR_EMAIL_ADDRESS=c.CAMPAIGN_EMAIL_ADDRESS and A.SEMR_EMAIL_ADDRESS<>'idevents') or a.SEMR_CONTACT_ID=C.CAMPAIGN_CONTACT_ID OR

    (A.SEMR_FIRST_NAME=C.CAMPAIGN_FIRST_NAME AND A.SEMR_LAST_NAME=C.CAMPAIGN_LAST_NAME AND A.SEMR_COMPANY_NAME=C.CAMPAIGN_COMPANY_NAME))

    full outer join LEADS b (nolock) on a.MPAM_BGET_ID=b.PLDS_BGET_ID

    and (a.SEMR_CONTACT_ID=b.CONTACT_ID or (a.SEMR_EMAIL_ADDRESS=b.PLDS_EMAIL_ADDRESS and a.SEMR_EMAIL_ADDRESS<>'idevents') or (a.SEMR_FIRST_NAME=B.PLDS_FIRST_NAME

    and a.SEMR_LAST_NAME=B.PLDS_LAST_NAME and A.SEMR_COMPANY_NAME=B.PLDS_COMPANY_NAME))

  • maysoonshahin:

    Couple suggestions. Let me know either way if anything isn't clear.

    # 1) Please don't use the "select (*) " in your code. Only select the "required" fields you'll be using.

    # 2) It would be a good idea to create individual (temp tables) for each query set

    a. (CAMPAIGNS & EVENTS) >> Query 1

    b. (CAMPAIGNS & LEADS) >> Query 2

    # 3) Separate your queries so that there are 2 queries at first. You can insert the output into a temp table or table variable.

    Use this in your "Join" statement (only)

    a. ** For your first Query: >> "On (c.campaign_mpam_activity_id = a.mpam_activity_id And a.semr_uuid = c.CAMPAIGN_UUID)

    b. ** For your second Query: >> "On a.MPAM_BGET_ID = b.PLDS_BGET_ID"

    c. ** You can include the other conditions in the "Where Clause" for each section.

    So More or less:

    DECLARE @CampaignsEvents TABLE ([yourfieldshere] varchar(100))

    INSERT INTO @CampaignsEvents

    (

    [yourfieldshere]

    )

    Select

    Distinct

    *

    , position As POSITION_DERIVED

    , CAMPAIGN_PERSON_STATUS As PERSON_STATUS_DERIVED

    , CAMPAIGN_COMPANY_NAME As COMPANY_NAME_DERIVED

    , CAMPAIGN_FIRST_NAME + '' + CAMPAIGN_LAST_NAME As FULL_NAME_DERIVED

    From

    CAMPAIGNS c WITH ( nolock )

    Full Outer Join EVENTS a WITH ( nolock )

    On (c.campaign_mpam_activity_id = a.mpam_activity_id And a.semr_uuid = c.CAMPAIGN_UUID )

    WHERE ( a.SEMR_EMAIL_ADDRESS = c.CAMPAIGN_EMAIL_ADDRESS And a.SEMR_EMAIL_ADDRESS <> 'idevents' )

    Or ( a.SEMR_CONTACT_ID = c.CAMPAIGN_CONTACT_ID)

    Or ( a.SEMR_FIRST_NAME = c.CAMPAIGN_FIRST_NAME And a.SEMR_LAST_NAME = c.CAMPAIGN_LAST_NAME And a.SEMR_COMPANY_NAME = c.CAMPAIGN_COMPANY_NAME )

    -- ************************************************************************************************************************

    DECLARE @CampaignsLeads TABLE ([yourfieldshere] varchar(100))

    INSERT INTO @CampaignsLeads

    (

    [yourfieldshere]

    )

    Select Distinct

    *

    , position As POSITION_DERIVED

    , CAMPAIGN_PERSON_STATUS As PERSON_STATUS_DERIVED

    , CAMPAIGN_COMPANY_NAME As COMPANY_NAME_DERIVED

    , CAMPAIGN_FIRST_NAME + '' + CAMPAIGN_LAST_NAME As FULL_NAME_DERIVED

    FROM CAMPAIGNS c WITH ( nolock )

    Full Outer Join LEADS b ( nolock )

    On ( a.MPAM_BGET_ID = b.PLDS_BGET_ID )

    WHERE ( a.SEMR_CONTACT_ID = b.CONTACT_ID )

    OR ( a.SEMR_EMAIL_ADDRESS = b.PLDS_EMAIL_ADDRESS And a.SEMR_EMAIL_ADDRESS <> 'idevents')

    OR ( a.SEMR_FIRST_NAME = b.PLDS_FIRST_NAME And a.SEMR_LAST_NAME = b.PLDS_LAST_NAME And a.SEMR_COMPANY_NAME = b.PLDS_COMPANY_NAME)

    -- ************************************************************************************************************************

    -- Finally Join @CampaignsEvents with @CampaignsLeads

  • Thanks for the quick response!

    I tried breaking them out into 2 separate queries (except, I do not have them as where conditions because I need all rows & columns from all tables). When I broke them out into 2 queries, the first query ran in about 10 minutes but when I joined the temp table from the first query to the LEADs table, it took hours to run & had to kill it.

    I would have expected the first query to be the problem since the CAMPAIGN table has 4M rows and the EVENTS table has 1.5M rows, while the LEADS table only has about 300k rows.

  • Okay have you tried inserting records into a temporary table on individual conditions yet?

    So to see if it's a particular condition that's causing the long wait separate them to see if you can pin-point the exact cause.

    Example:

    Keep this on your Join Condition >> a.SEMR_CONTACT_ID = b.CONTACT_ID

    Run this in your Where Clause only:

    Where Clause for Query 1.1 >> ( a.SEMR_EMAIL_ADDRESS = b.PLDS_EMAIL_ADDRESS And a.SEMR_EMAIL_ADDRESS <> 'idevents')

    Run this in your Where Clause only:

    Where Clause for Query 1.2 >> ( a.SEMR_FIRST_NAME = B.PLDS_FIRST_NAME And a.SEMR_LAST_NAME = B.PLDS_LAST_NAMEAnd a.SEMR_COMPANY_NAME = B.PLDS_COMPANY_NAME )

  • I would say look out for nulls.

    Also, if your result set has millions of rows, do temp tables and forget about table variables.

    Stay away from this unless you know you will only have one row.

    DECLARE @CampaignsLeads TABLE ([yourfieldshere] varchar(100))

    INSERT INTO @CampaignsLeads

    -- Finally Join @CampaignsEvents with @CampaignsLeads

    this will make your server do joins between millions and million of rows using one thread so your CPU will be low but the query will run worse than ever before because in 2008 the statistics is hard coded as 1 row so you will have loop join hell occurring between millions of rows.

    Seriously, forget about table variables when dealing with more than one row.

    Forever and ever till kingdom come.

    select distinct *,

    position AS POSITION_DERIVED,

    CAMPAIGN_PERSON_STATUS AS PERSON_STATUS_DERIVED,

    CAMPAIGN_COMPANY_NAME AS COMPANY_NAME_DERIVED,

    CAMPAIGN_FIRST_NAME + '' + CAMPAIGN_LAST_NAME AS FULL_NAME_DERIVED

    into #all_date_joined

    from CAMPAIGNS c (nolock)

    full outer join EVENTS a (nolock)

    on c.campaign_mpam_activity_id=a.mpam_activity_id

    and (a.semr_uuid=c.CAMPAIGN_UUID

    or (a.SEMR_EMAIL_ADDRESS=c.CAMPAIGN_EMAIL_ADDRESS

    and A.SEMR_EMAIL_ADDRESS<>'idevents'

    )

    or a.SEMR_CONTACT_ID=C.CAMPAIGN_CONTACT_ID

    OR (A.SEMR_FIRST_NAME=C.CAMPAIGN_FIRST_NAME

    AND A.SEMR_LAST_NAME=C.CAMPAIGN_LAST_NAME

    AND A.SEMR_COMPANY_NAME=C.CAMPAIGN_COMPANY_NAME

    )

    )

    full outer join LEADS b (nolock)

    on a.MPAM_BGET_ID=b.PLDS_BGET_ID

    and (a.SEMR_CONTACT_ID=b.CONTACT_ID

    or (a.SEMR_EMAIL_ADDRESS=b.PLDS_EMAIL_ADDRESS

    and a.SEMR_EMAIL_ADDRESS<>'idevents')

    or

    (a.SEMR_FIRST_NAME=B.PLDS_FIRST_NAME

    and a.SEMR_LAST_NAME=B.PLDS_LAST_NAME

    and A.SEMR_COMPANY_NAME=B.PLDS_COMPANY_NAME

    )

    )

    Catch-all queries done right [/url]
    Gail Shaw's Performance Blog[/url]

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply