How will you optimize this Query (having many SELF (LEFT) Joins) ?

  • Dear All,

    I have got a small query for optimization and I need your help in it.

    It is taking 70-80 sec. on production server.

    Table structure, Query and Execution plan is attached for your reference.

    Any help will be appreciated.

    Thanks.

    QUERY:

    SELECT

    comp.comp_id AS comp_id, parent_comp.comp_code AS parent_comp_code, parent_comp.name AS parent_comp_name,

    grand_parent_comp.comp_id AS grand_parent_comp_id, grand_parent_comp.comp_code AS grand_parent_comp_code,

    grand_parent_comp.name AS grand_parent_comp_name, level0_comp.comp_id AS level0_comp_id,

    level0_comp.comp_code AS level0_comp_code, level0_comp.name AS level0_comp_name, level1_comp.comp_id AS level1_comp_id,

    level1_comp.comp_code AS level1_comp_code, level1_comp.name AS level1_comp_name, level2comp.comp_id AS second_level_comp_id,

    level2comp.comp_code AS second_level_comp_code, level2comp.name AS second_level_comp_name

    from

    dbo.comp as comp

    left join dbo.comp as parent_comp on comp.parent_comp_id = parent_comp.comp_id

    left join dbo.comp as grand_parent_comp on parent_comp.parent_comp_id = grand_parent_comp.comp_id

    left join dbo.comp as level0_comp on grand_parent_comp.parent_comp_id = level0_comp.comp_id

    left join dbo.comp as level1_comp on level0_comp.parent_comp_id = level1_comp.comp_id

    left join dbo.comp as level2comp

    on level2comp.comp_id =

    (select top 1

    lvl2_comp.comp_id

    from

    dbo.comp lvl2_comp, dbo.comp top_comp, dbo.comp_chain bc

    where

    isnull(top_comp.parent_comp_id, '0') ='0' and

    top_comp.company_code = comp.company_code and

    lvl2_comp.parent_comp_id = top_comp.comp_id and

    bc.comp_id = lvl2_comp.comp_id and

    bc.chain_comp_id = comp.comp_id

    order by lvl2_comp.comp_id

    )

    IO STATISTICS

    (89090 row(s) affected)

    Table 'comp'. Scan count 1, logical reads 637983, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'comp_chain'. Scan count 89090, logical reads 268582, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

  • -- Some reformatting to make the code easier to read

    -- and a small change to make the logic a little easier to follow.

    -- Please check that this query returns the same results as the original.

    -- Please post some sample data. If you're not sure of how to do this,

    -- there's an article linked in my sig. The sample data should contain at

    -- least one set which has members in all 6 level;s of the hierarchy.

    SELECT

    L0.comp_id AS comp_id,

    L1.comp_code AS parent_comp_code,

    L1.name AS parent_comp_name,

    L2.comp_id AS grand_parent_comp_id,

    L2.comp_code AS grand_parent_comp_code,

    L2.name AS grand_parent_comp_name,

    L3.comp_id AS level0_comp_id,

    L3.comp_code AS level0_comp_code,

    L3.name AS level0_comp_name,

    L4.comp_id AS level1_comp_id,

    L4.comp_code AS level1_comp_code,

    L4.name AS level1_comp_name,

    L5.comp_id AS second_level_comp_id,

    L5.comp_code AS second_level_comp_code,

    L5.name AS second_level_comp_name

    FROM dbo.comp L0

    LEFT JOIN dbo.comp L1 ON L1.comp_id = L0.parent_comp_id

    LEFT JOIN dbo.comp L2 ON L2.comp_id = L1.parent_comp_id

    LEFT JOIN dbo.comp L3 ON L3.comp_id = L2.parent_comp_id

    LEFT JOIN dbo.comp L4 ON L4.comp_id = L3.parent_comp_id

    --LEFT JOIN dbo.comp L5 ON L5.comp_id = one less table reference

    OUTER APPLY (

    SELECT TOP 1

    lvl2_comp.comp_id,

    lvl2_comp.comp_code,

    lvl2_comp.name

    FROM dbo.comp lvl2_comp

    INNER JOIN dbo.comp top_comp

    ON lvl2_comp.parent_comp_id = top_comp.comp_id

    AND top_comp.company_code = L0.company_code -- outer reference

    INNER JOIN dbo.comp_chain bc

    ON bc.comp_id = lvl2_comp.comp_id

    AND bc.chain_comp_id = L0.comp_id -- outer reference

    WHERE top_comp.parent_comp_id IS NULL OR top_comp.parent_comp_id = '0'

    ORDER BY lvl2_comp.comp_id

    ) L5

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks for the reply Chris,

    I have already posted some sample data in my earlier post. Due to some legal restrictions, I can't post more data.

    I have executed the code and results are as below:

    1. As you Optimized

    (89090 row(s) affected)

    Table 'comp'. Scan count 199405, logical reads 907812, physical reads 83, read-ahead reads 873, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'comp_chain'. Scan count 89090, logical reads 268582, physical reads 445, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    2. After changing a line in your code

    --WHERE top_comp.parent_comp_id IS NULL OR top_comp.parent_comp_id = '0'

    where isnull(top_comp.parent_comp_id, '0') ='0'

    (89090 row(s) affected)

    Table 'comp'. Scan count 1, logical reads 604512, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'comp_chain'. Scan count 89090, logical reads 268582, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    Original IO statistics

    Table 'comp'. Scan count 1, logical reads 637983

    New IO statistics

    Table 'comp'. Scan count 1, logical reads 604512

  • theashishtrivedi (4/2/2013)


    Thanks for the reply Chris,

    I have already posted some sample data in my earlier post. Due to some legal restrictions, I can't post more data.

    Can you please read the article I mentioned? It will show you how to set up your sample data as a set of INSERTs, so folks don't have to spend half an hour or more prepping data before they can begin to investigate the problem.

    Many of us deal with sensitive data every day so I understand your comment, however to fully test any solution, we will need sample data which touches every part of the query. If necessary you could mask real values with dummy values. Numeric keys are fine, the 'name' values might require masking.

    Did the query I posted return the expected result set? I'm not concerned yet with performance, I'm toying with it to construct something which is easier to work with.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris,

    I have attached sample data for both the tables.

  • Hierarchical data is very difficult in relational structures. If you can refactor the table you might get a win from using the built-in hierarchy datatype, which is CLR based under the covers. I would recommend checking out stuff written by Itzik Ben-Gan too. He has done a number of articles (and book topics I think) on the topic of hierarchies. Check SQLMag.com. There is a neat trick you can do with binary casting that can sometimes get things done without bajillions of reads/iterations.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • T.Ashish (4/2/2013)


    Chris,

    I have attached sample data for both the tables.

    Thanks, it's so much easier when you have sample data to play with.

    As Kevin pointed out, this isn't the most efficient type of hierarchy to return results from, and if you have the resources you should consider the alternatives. If not, then a couple of new indexes should give you a decent performance lift:

    -- Change column parent_comp_id to be not nullable

    ALTER TABLE comp ALTER COLUMN parent_comp_id VARCHAR(24) NOT NULL

    -- Two new indexes:

    CREATE UNIQUE NONCLUSTERED INDEX [ix_HierarchyA] ON [dbo].[comp]

    (parent_comp_id ASC, comp_id ASC)

    INCLUDE

    (comp_code, [name], company_code)

    CREATE UNIQUE NONCLUSTERED INDEX [ix_HierarchyB] ON [dbo].[comp]

    (comp_id ASC, parent_comp_id ASC)

    INCLUDE

    (comp_code, [name], company_code)

    The subquery for returning the level5 details looks like trial and error coding and is difficult to make sense of:

    SELECT TOP 1

    L.comp_id,

    L.comp_code,

    L.name

    FROM dbo.comp L

    INNER JOIN dbo.comp tc

    ON tc.comp_id = L.parent_comp_id

    AND tc.company_code = L0.company_code -- outer reference

    INNER JOIN dbo.comp_chain bc

    ON bc.comp_id = L.comp_id

    AND bc.chain_comp_id = L0.comp_id -- outer reference

    WHERE tc.parent_comp_id IS NULL OR tc.parent_comp_id = '0'

    ORDER BY L.comp_id

    Can you explain what it's meant to do?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply