Correlated subquery takes too long to return results

  • My correlated subquery takes too long to return results... please help in splitting this into a join...

    select * from xxxxflow f
    where (mstr_ordid) = (select max(mstr_ordid) from xxxxflow b
                          where b.Ord_Num = f.Ord_Num
                          and convert(varchar(10), b.date, 101) = convert(varchar(10), f.date_time, 101)
                         )

    Thanks in advance

    Raghu

  • Your biggest problem is that you are using functions against columns in the WHERE clause of the subquery, which will cause a table scan. You should first look at how you can eliminate that. Can you give us the definition of the table, some test data, and what you are trying to accomplish with the query? Can you convert it to a stored procedure where we can use table variables/temp tables?

    This may be what you want:

    [font="Courier New"]SELECT

        F.*

    FROM

        xxxxflow F JOIN

        (SELECT

            MAX(mstr_ordid) AS max_ord_id,

            Ord_Num,

             CONVERT(VARCHAR(10),b.date,101) AS date_string

        FROM

             xxxflow) B ON

            f.Ord_Num = B.Ord_Num AND

            CONVERT(VARCHAR(10),f.date_time,101) = B.date_string AND

            F.mstr_ordid = B.max_ord_id

    [/font]
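
    If you do move this into a stored procedure, a temp-table version might look something like the sketch below. It uses the same column names as the query above and pre-aggregates per order/day before joining back (#max_per_day is just an illustrative name):

    -- Sketch only: pre-aggregate the max mstr_ordid per order/day into a temp table,
    -- then join back to the base table.
    SELECT
        b.Ord_Num,
        CONVERT(VARCHAR(10), b.date, 101) AS date_string,
        MAX(b.mstr_ordid) AS max_ord_id
    INTO #max_per_day
    FROM xxxxflow b
    GROUP BY b.Ord_Num, CONVERT(VARCHAR(10), b.date, 101)

    SELECT
        F.*
    FROM
        xxxxflow F JOIN
        #max_per_day M ON
        F.Ord_Num = M.Ord_Num AND
        CONVERT(VARCHAR(10), F.date_time, 101) = M.date_string AND
        F.mstr_ordid = M.max_ord_id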

  • Hi, thank you for the update, but I am still facing the same problem.

    My main table XXX_FLOW has 125,638,737 records in it, and

    my SP Billing_Completed_Flow accesses a view Vw_XXX_FLOW created using the following correlated subquery:

    create view Vw_XXX_FLOW as
    select * from XXX_FLOW sf2
    where (mstr_ord_id) = (select max(mstr_ord_id) from XXX_FLOW b
                           where b.Ord_Num = sf2.Ord_Num
                           and convert(varchar(10), b.Inserted_time, 101) = convert(varchar(10), sf2.Inserted_time, 101)
                          )

    Due to the above view, my SP takes 23 hours to complete.

    Please suggest a way to tune this query.

    Thanks in advance.

    Raghu

  • Are you converting the dates because you want to match on the date, but the time is immaterial? If so, check out this thread for some ideas on changing the date compare to look at a range:

    http://www.sqlservercentral.com/Forums/Topic529603-8-1.aspx#bm529668
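
    The basic idea is something like this (just a sketch using the column names from your view; it keeps b.Inserted_time bare on one side of each comparison so an index on it can still be used). The correlated subquery in the view would become:

    select max(b.mstr_ord_id)
    from XXX_FLOW b
    where b.Ord_Num = sf2.Ord_Num
    -- dateadd(dd, datediff(dd, 0, x), 0) strips the time, giving midnight of x's day
    and b.Inserted_time >= dateadd(dd, datediff(dd, 0, sf2.Inserted_time), 0)
    and b.Inserted_time <  dateadd(dd, datediff(dd, 0, sf2.Inserted_time), 0) + 1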


    And then again, I might be wrong ...
    David Webb

  • As Jack once said:

    Check out these links on how to get faster and more accurate answers:

    Forum Etiquette: How to post data/code on a forum to get the best help

    Need an Answer? Actually, No ... You Need a Question

    I suggest again ... read them, and then post your question.

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you - before posting a question please read
    Before posting a performance problem please read

  • g.raghunathan (7/29/2008)


    Hi, thank you for the update, but I am still facing the same problem.

    My main table XXX_FLOW has 125,638,737 records in it, and

    my SP Billing_Completed_Flow accesses a view Vw_XXX_FLOW created using the following correlated subquery:

    create view Vw_XXX_FLOW as
    select * from XXX_FLOW sf2
    where (mstr_ord_id) = (select max(mstr_ord_id) from XXX_FLOW b
                           where b.Ord_Num = sf2.Ord_Num
                           and convert(varchar(10), b.Inserted_time, 101) = convert(varchar(10), sf2.Inserted_time, 101)
                          )

    Due to the above view, my SP takes 23 hours to complete.

    Please suggest a way to tune this query.

    Thanks in advance.

    Raghu

    First of all, that is a HORRIBLE view. You need to get rid of the SELECT * first. Then, as David said, try to find a way to get rid of the conversion on the dates. The optimizer is likely to ignore any indexes because of these two things. Then try to find a way to limit the data in the subquery; you are giving it no filters, so it HAS to scan every row.

    If you can explain what you need for a result, with some table schemas and some test data, we will probably be able to find a better way.

    Did you try the query I provided earlier and did it perform any better?
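
    As a rough sketch of the first two changes combined (the column list is only a placeholder, since we haven't seen the table definition):

    create view Vw_XXX_FLOW as
    select sf2.mstr_ord_id, sf2.Ord_Num, sf2.Inserted_time   -- placeholder: list only the columns the SP actually needs
    from XXX_FLOW sf2
    where sf2.mstr_ord_id = (select max(b.mstr_ord_id)
                             from XXX_FLOW b
                             where b.Ord_Num = sf2.Ord_Num
                             and b.Inserted_time >= dateadd(dd, datediff(dd, 0, sf2.Inserted_time), 0)
                             and b.Inserted_time <  dateadd(dd, datediff(dd, 0, sf2.Inserted_time), 0) + 1)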

  • Jack Corbett (7/25/2008)


    Your biggest problem is that you are using functions against columns in the WHERE clause of the subquery, which will cause a table scan. You should first look at how you can eliminate that. Can you give us the definition of the table, some test data, and what you are trying to accomplish with the query? Can you convert it to a stored procedure where we can use table variables/temp tables?

    This may be what you want:

    SELECT
        F.*
    FROM
        xxxxflow F JOIN
        (SELECT
            MAX(b.mstr_ordid) AS max_ord_id,
            b.Ord_Num,
            CONVERT(VARCHAR(10), b.date, 101) AS date_string
        FROM
            xxxxflow b) B ON
        F.Ord_Num = B.Ord_Num AND
        CONVERT(VARCHAR(10), F.date_time, 101) = B.date_string AND
        F.mstr_ordid = B.max_ord_id

    Jack - so the function against columns is causing the table scan in the WHERE, but not in the JOIN in your re-write? Does SQL not have to scan to evaluate the date from F to compare to B.date_string?

    ---------------------------------------------------------
    How best to post your question
    How to post performance problems
    Tally Table: What it is and how it replaces a loop

    "stewsterl 80804 (10/16/2009): I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • Sure, it is still an issue in my query, but by using the join and doing the conversion on one side in the derived table, I reduce it to half the equation. Actually, now that I re-examine my code, it won't work as posted anyway because I have no GROUP BY in the derived table. You would need to add:

    GROUP BY
        b.Ord_Num,
        CONVERT(VARCHAR(10), b.date, 101)

    to the derived table.

    Without seeing the table definitions and having more information about the desired results, with some test data, it is hard to give a better answer.
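
    For reference, the corrected derived-table version would look something like this (the query from earlier in the thread with the GROUP BY added):

    SELECT
        F.*
    FROM
        xxxxflow F JOIN
        (SELECT
            MAX(b.mstr_ordid) AS max_ord_id,
            b.Ord_Num,
            CONVERT(VARCHAR(10), b.date, 101) AS date_string
        FROM
            xxxxflow b
        GROUP BY
            b.Ord_Num,
            CONVERT(VARCHAR(10), b.date, 101)) B ON
        F.Ord_Num = B.Ord_Num AND
        CONVERT(VARCHAR(10), F.date_time, 101) = B.date_string AND
        F.mstr_ordid = B.max_ord_id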

  • I think it would help to know more about the table structure (at least the fields involved in the join/where clauses), some sample data to test against, and the expected results based on that sample data (to check tests against).

    It looks like a self join using a correlated subquery. Knowing more about the data will help in figuring out how to join things together.

    😎

  • Thanks Jack, I just wanted to make sure I correctly understood how the use of functions was limiting performance.

    ---------------------------------------------------------
    How best to post your question
    How to post performance problems
    Tally Table: What it is and how it replaces a loop

    "stewsterl 80804 (10/16/2009): I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • select * from xxxxflow f
    where (mstr_ordid) = (select max(mstr_ordid) from xxxxflow b
                          where b.Ord_Num = f.Ord_Num
                          and convert(varchar(10), b.date, 101) = convert(varchar(10), f.date_time, 101)
                         )

    Some ways to improve performance:

    select (specify column names) from xxxxflow f
    where (mstr_ordid) =
    (
    select max(mstr_ordid) from xxxxflow b
    where b.Ord_Num = f.Ord_Num
    and convert(varchar(10), b.date, 101) = convert(varchar(10), f.date_time, 101) -- Is it possible to convert to an integer instead?
    -- If you convert to varchar, SQL will compare the strings character by character, which is a bit of a performance hit.
    )
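
    For example, something along these lines compares the dates as whole-day integers instead of strings (just a sketch; it still applies a function to both columns, so it mainly avoids the string comparison):

    select (specify column names) from xxxxflow f
    where (mstr_ordid) =
    (
    select max(mstr_ordid) from xxxxflow b
    where b.Ord_Num = f.Ord_Num
    -- datediff(dd, 0, ...) expresses each date as an integer number of days since 1900-01-01
    and datediff(dd, 0, b.date) = datediff(dd, 0, f.date_time)
    )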

    Regards,

    Venkatesan Prabu .J

    http://venkattechnicalblog.blogspot.com/

    Thanks and Regards,
    Venkatesan Prabu, 😛
    My Blog:

    http://venkattechnicalblog.blogspot.com/

  • This also works for me and seems slightly faster than the convert to varchar.

    select f.*
    from xxxxflow f
    inner join (select
                    MAX(mstr_ordid) as mstr_ordid,
                    Ord_Num,
                    convert(datetime, floor(convert(float, date))) as d1,
                    convert(datetime, ceiling(convert(float, date))) as d2
                from xxxxflow
                group by
                    Ord_Num,
                    convert(datetime, floor(convert(float, date))),
                    convert(datetime, ceiling(convert(float, date)))
               ) b on b.Ord_Num = f.Ord_Num
                  and b.mstr_ordid = f.mstr_ordid
                  and b.d1 <= f.date_time
                  and b.d2 > f.date_time
    order by f.Ord_Num

    I also created this index:

    CREATE INDEX ix_xxxxflow_index2 ON xxxxflow (mstr_ordid, Ord_Num,date)

    I tested the following on approx 3.5 million rows of data:

    Runtime original query: I stopped it after 12 minutes

    Runtime convert to varchar: 45 seconds

    Runtime floor/ceiling: 25 seconds
