Correlated subquery takes too long to return results

  • My correlated subquery takes too long to return results... please help in splitting this into a join...

    select * from xxxxflow f
    where (mstr_ordid) = (select max(mstr_ordid) from xxxxflow b
                          where b.Ord_Num = f.Ord_Num
                          and convert(varchar(10), b.date, 101) = convert(varchar(10), f.date_time, 101)
                         )

    Thanks in advance

    Raghu

  • Your biggest problem is that you are using functions against columns in the WHERE clause of the subquery, which will cause a table scan. You should first look at how you can eliminate that. Can you give us the definition of the table, some test data, and what you are trying to accomplish with the query? Can you convert it to a stored procedure where we can use table variables/temp tables?

    This may be what you want:

    [font="Courier New"]SELECT

        F.*

    FROM

        xxxxflow F JOIN

        (SELECT

            MAX(mstr_ordid) AS max_ord_id,

            Ord_Num,

             CONVERT(VARCHAR(10),b.date,101) AS date_string

        FROM

             xxxflow) B ON

            f.Ord_Num = B.Ord_Num AND

            CONVERT(VARCHAR(10),f.date_time,101) = B.date_string AND

            F.mstr_ordid = B.max_ord_id

    [/font]
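
    If you do move this into a stored procedure, a temp-table version might look something like the sketch below. It uses the same column names as the query above and pre-aggregates per order/day before joining back (#max_per_day is just an illustrative name):

    -- Sketch only: pre-aggregate the max mstr_ordid per order/day into a temp table,
    -- then join back to the base table.
    SELECT
        b.Ord_Num,
        CONVERT(VARCHAR(10), b.date, 101) AS date_string,
        MAX(b.mstr_ordid) AS max_ord_id
    INTO #max_per_day
    FROM xxxxflow b
    GROUP BY b.Ord_Num, CONVERT(VARCHAR(10), b.date, 101)

    SELECT
        F.*
    FROM
        xxxxflow F JOIN
        #max_per_day M ON
        F.Ord_Num = M.Ord_Num AND
        CONVERT(VARCHAR(10), F.date_time, 101) = M.date_string AND
        F.mstr_ordid = M.max_ord_id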

  • Hi, thank you for the update, but I am still facing the same problem.

    My main table XXX_FLOW has 125,638,737 records in it, and

    my SP Billing_Completed_Flow accesses a view Vw_XXX_FLOW created using the following correlated subquery:

    create view Vw_XXX_FLOW as
    select * from XXX_FLOW sf2
    where (mstr_ord_id) = (select max(mstr_ord_id) from XXX_FLOW b
                           where b.Ord_Num = sf2.Ord_Num
                           and convert(varchar(10), b.Inserted_time, 101) = convert(varchar(10), sf2.Inserted_time, 101)
                          )

    Due to the above view, my SP takes 23 hours to complete.

    Please suggest a way to tune this query.

    Thanks in advance.

    Raghu

  • Are you converting the dates because you want to match on the date, but the time is immaterial? If so, check out this thread for some ideas on changing the date compare to look at a range:

    http://www.sqlservercentral.com/Forums/Topic529603-8-1.aspx#bm529668
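
    The basic idea is something like this (just a sketch using the column names from your view; it keeps b.Inserted_time bare on one side of each comparison so an index on it can still be used). The correlated subquery in the view would become:

    select max(b.mstr_ord_id)
    from XXX_FLOW b
    where b.Ord_Num = sf2.Ord_Num
    -- dateadd(dd, datediff(dd, 0, x), 0) strips the time, giving midnight of x's day
    and b.Inserted_time >= dateadd(dd, datediff(dd, 0, sf2.Inserted_time), 0)
    and b.Inserted_time <  dateadd(dd, datediff(dd, 0, sf2.Inserted_time), 0) + 1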


    And then again, I might be wrong ...
    David Webb

  • As Jack once said:

    Check out these links on how to get faster and more accurate answers:

    Forum Etiquette: How to post data/code on a forum to get the best help

    Need an Answer? Actually, No ... You Need a Question

    I suggest again ... read them, and then post your question.

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you - before posting a question please read
    Before posting a performance problem please read

  • g.raghunathan (7/29/2008)


    Hi, thank you for the update, but I am still facing the same problem.

    My main table XXX_FLOW has 125,638,737 records in it, and

    my SP Billing_Completed_Flow accesses a view Vw_XXX_FLOW created using the following correlated subquery:

    create view Vw_XXX_FLOW as
    select * from XXX_FLOW sf2
    where (mstr_ord_id) = (select max(mstr_ord_id) from XXX_FLOW b
                           where b.Ord_Num = sf2.Ord_Num
                           and convert(varchar(10), b.Inserted_time, 101) = convert(varchar(10), sf2.Inserted_time, 101)
                          )

    Due to the above view, my SP takes 23 hours to complete.

    Please suggest a way to tune this query.

    Thanks in advance.

    Raghu

    First of all, that is a HORRIBLE view. You need to get rid of the SELECT * first. Then, as David said, try to find a way to get rid of the conversion on the dates. The optimizer is likely to ignore any indexes because of these two things. Then try to find a way to limit the data in the subquery; you are giving it no filters, so it HAS to scan every row.

    If you can explain what you need for a result, with some table schemas and some test data, we will probably be able to find a better way.

    Did you try the query I provided earlier and did it perform any better?
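
    As a rough sketch of the first two changes combined (the column list is only a placeholder, since we haven't seen the table definition):

    create view Vw_XXX_FLOW as
    select sf2.mstr_ord_id, sf2.Ord_Num, sf2.Inserted_time   -- placeholder: list only the columns the SP actually needs
    from XXX_FLOW sf2
    where sf2.mstr_ord_id = (select max(b.mstr_ord_id)
                             from XXX_FLOW b
                             where b.Ord_Num = sf2.Ord_Num
                             and b.Inserted_time >= dateadd(dd, datediff(dd, 0, sf2.Inserted_time), 0)
                             and b.Inserted_time <  dateadd(dd, datediff(dd, 0, sf2.Inserted_time), 0) + 1)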

  • Jack Corbett (7/25/2008)


    Your biggest problem is that you are using functions against columns in the WHERE clause of the subquery, which will cause a table scan. You should first look at how you can eliminate that. Can you give us the definition of the table, some test data, and what you are trying to accomplish with the query? Can you convert it to a stored procedure where we can use table variables/temp tables?

    This may be what you want:

    SELECT
        F.*
    FROM
        xxxxflow F JOIN
        (SELECT
            MAX(b.mstr_ordid) AS max_ord_id,
            b.Ord_Num,
            CONVERT(VARCHAR(10), b.date, 101) AS date_string
        FROM
            xxxxflow b) B ON
        F.Ord_Num = B.Ord_Num AND
        CONVERT(VARCHAR(10), F.date_time, 101) = B.date_string AND
        F.mstr_ordid = B.max_ord_id

    Jack - so the function against columns is causing the table scan in the WHERE, but not in the JOIN in your re-write? Does SQL not have to scan to evaluate the date from F to compare to B.date_string?

    ---------------------------------------------------------
    How best to post your question
    How to post performance problems
    Tally Table: What it is and how it replaces a loop

    "stewsterl 80804 (10/16/2009): I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • Sure, it is still an issue in my query, but by using the join and doing the conversion on one side in the derived table, I reduce it to half the equation. Actually, now that I re-examine my code, it won't work as posted anyway because I have no GROUP BY in the derived table. You would need to add:

    GROUP BY
        b.Ord_Num,
        CONVERT(VARCHAR(10), b.date, 101)

    to the derived table.

    Without seeing the table definitions and having more information about the desired results, with some test data, it is hard to give a better answer.
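
    For reference, the corrected derived-table version would look something like this (the query from earlier in the thread with the GROUP BY added):

    SELECT
        F.*
    FROM
        xxxxflow F JOIN
        (SELECT
            MAX(b.mstr_ordid) AS max_ord_id,
            b.Ord_Num,
            CONVERT(VARCHAR(10), b.date, 101) AS date_string
        FROM
            xxxxflow b
        GROUP BY
            b.Ord_Num,
            CONVERT(VARCHAR(10), b.date, 101)) B ON
        F.Ord_Num = B.Ord_Num AND
        CONVERT(VARCHAR(10), F.date_time, 101) = B.date_string AND
        F.mstr_ordid = B.max_ord_id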

  • I think it would help to know more about the table structure (at least the fields involved in the join/where clauses), some sample data to test against, and the expected results based on that sample data (to check tests against).

    It looks like a self join using a correlated subquery. Knowing more about the data will help in figuring out how to join things together.

    😎

  • Thanks Jack, I just wanted to make sure I correctly understood how the use of functions was limiting performance.

    ---------------------------------------------------------
    How best to post your question
    How to post performance problems
    Tally Table: What it is and how it replaces a loop

    "stewsterl 80804 (10/16/2009): I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • select * from xxxxflow f
    where (mstr_ordid) = (select max(mstr_ordid) from xxxxflow b
                          where b.Ord_Num = f.Ord_Num
                          and convert(varchar(10), b.date, 101) = convert(varchar(10), f.date_time, 101)
                         )

    Some ways to improve performance:

    select (specify column names) from xxxxflow f
    where (mstr_ordid) =
    (
    select max(mstr_ordid) from xxxxflow b
    where b.Ord_Num = f.Ord_Num
    and convert(varchar(10), b.date, 101) = convert(varchar(10), f.date_time, 101) -- Is it possible to convert to an integer instead?
    -- If you convert to varchar, SQL will compare the strings character by character, which is a bit of a performance hit.
    )
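
    For example, something along these lines compares the dates as whole-day integers instead of strings (just a sketch; it still applies a function to both columns, so it mainly avoids the string comparison):

    select (specify column names) from xxxxflow f
    where (mstr_ordid) =
    (
    select max(mstr_ordid) from xxxxflow b
    where b.Ord_Num = f.Ord_Num
    -- datediff(dd, 0, ...) expresses each date as an integer number of days since 1900-01-01
    and datediff(dd, 0, b.date) = datediff(dd, 0, f.date_time)
    )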

    Regards,

    Venkatesan Prabu .J

    http://venkattechnicalblog.blogspot.com/

    Thanks and Regards,
    Venkatesan Prabu, 😛
    My Blog:

    http://venkattechnicalblog.blogspot.com/

  • This also works for me and seems slightly faster than the convert to varchar.

    select f.*
    from xxxxflow f
    inner join (select
                    MAX(mstr_ordid) as mstr_ordid,
                    Ord_Num,
                    convert(datetime, floor(convert(float, date))) as d1,
                    convert(datetime, ceiling(convert(float, date))) as d2
                from xxxxflow
                group by
                    Ord_Num,
                    convert(datetime, floor(convert(float, date))),
                    convert(datetime, ceiling(convert(float, date)))
               ) b on b.Ord_Num = f.Ord_Num
                  and b.mstr_ordid = f.mstr_ordid
                  and b.d1 <= f.date_time
                  and b.d2 > f.date_time
    order by f.Ord_Num

    I also created this index:

    CREATE INDEX ix_xxxxflow_index2 ON xxxxflow (mstr_ordid, Ord_Num,date)

    I tested the following on approx 3.5 million rows of data:

    Runtime original query: I stopped it after 12 minutes

    Runtime convert to varchar: 45 seconds

    Runtime floor/ceiling: 25 seconds
