Performance of Joins over Updates


dwilliscp
I was wondering if anyone knows which is faster for building the data for my reporting table.

Should I write the new rows using three outer joins, or use one outer join to get most of the data and then run three updates to load the other three columns?

Granted, there are not a lot of rows being loaded: about 500K each night, 70 MB of data. The linked tables are not large, except for the material table, which has several million rows.
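
For readers following along, the two candidate shapes might look roughly like the sketch below. The thread never shows the real schema, so every table and column name here (dbo.ReportTable, dbo.StagingOrders, dbo.Material, dbo.Plant, dbo.Customer and their columns) is a hypothetical stand-in, and only one of the follow-up updates is written out.

```sql
-- Hypothetical schema: all object names below are illustrative only.

-- Approach A: a single INSERT that resolves every lookup with outer joins.
INSERT INTO dbo.ReportTable
        (OrderId, PlantId, CustomerId, Qty, MaterialDesc, PlantName, CustomerName)
SELECT  s.OrderId, s.PlantId, s.CustomerId, s.Qty,
        m.MaterialDesc, p.PlantName, c.CustomerName
FROM    dbo.StagingOrders AS s
LEFT OUTER JOIN dbo.Material AS m ON m.MaterialId = s.MaterialId
LEFT OUTER JOIN dbo.Plant    AS p ON p.PlantId    = s.PlantId
LEFT OUTER JOIN dbo.Customer AS c ON c.CustomerId = s.CustomerId;

-- Approach B: INSERT with one outer join, leaving the other lookup columns NULL...
INSERT INTO dbo.ReportTable
        (OrderId, PlantId, CustomerId, Qty, MaterialDesc)
SELECT  s.OrderId, s.PlantId, s.CustomerId, s.Qty, m.MaterialDesc
FROM    dbo.StagingOrders AS s
LEFT OUTER JOIN dbo.Material AS m ON m.MaterialId = s.MaterialId;

-- ...then one UPDATE per remaining column (only the PlantName update is shown).
UPDATE  r
SET     r.PlantName = p.PlantName
FROM    dbo.ReportTable AS r
INNER JOIN dbo.Plant AS p ON p.PlantId = r.PlantId;
```

Approach B touches every loaded row once per update, so it writes the same rows several times; whether that costs more than the wider join in approach A is what the rest of the thread is about.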
Michael Valentine Jones
You haven't provided enough information for anyone to begin to answer that question.

You should just try each way to see which is faster.
dwilliscp
Michael Valentine Jones (9/7/2012)
You haven't provided enough information for anyone to begin to answer that question.

You should just try each way to see which is faster.

The problem with that is that before the second run I would need to make sure the data is flushed from memory, and I am not sure how to do that. Otherwise the second run should always be faster, since it does not have to do any hard drive I/O.
ChrisM@Work
dwilliscp (9/7/2012)
The problem with that is that before the second run I would need to make sure the data is flushed from memory, and I am not sure how to do that. Otherwise the second run should always be faster, since it does not have to do any hard drive I/O.


Then do several runs: A after A, A after B, B after A, B after B.
I've rarely seen an UPDATE to an intermediate table, as you describe, perform faster than a straight SELECT. When it does happen, it's most often because the developer has missed something.
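
One way to run that sequence and collect comparable numbers is sketched below. It assumes the two versions have been wrapped in hypothetical procedures, dbo.LoadReport_Joins (A) and dbo.LoadReport_Updates (B), and that each one empties the target table before loading; neither procedure comes from the thread.

```sql
-- Report logical/physical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Hypothetical procedures; each is assumed to reset the target table first.
EXEC dbo.LoadReport_Joins;    -- A, first run (cache state unknown)
EXEC dbo.LoadReport_Joins;    -- A after A
EXEC dbo.LoadReport_Updates;  -- B after A
EXEC dbo.LoadReport_Updates;  -- B after B
EXEC dbo.LoadReport_Joins;    -- A after B

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

On a busy server the logical read counts are usually a steadier basis for comparison than elapsed time, because they do not depend on what else is running.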

“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Michael Valentine Jones
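-- Flush dirty pages to disk, then drop the clean pages from the buffer pool
-- so the next run starts with a cold cache. Use on a test server only.
CHECKPOINT;
DBCC DROPCLEANBUFFERS;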


Alexander Suprun
dwilliscp (9/7/2012)
Should I write the new rows using three outer joins, or use one outer join to get most of the data and then run three updates to load the other three columns?
Usually the first approach is faster.
If any of these three columns is variable length (say, varchar), the rows are inserted with those columns empty and then grow when the UPDATE statements fill them in. On pages that are already full, that can cause a lot of page splits and therefore fragmentation.
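
A rough way to see whether the updates really fragment the table is to query sys.dm_db_index_physical_stats after each version of the load. The sketch below assumes the reporting table is called dbo.ReportTable, which is a made-up name.

```sql
-- dbo.ReportTable is a hypothetical name for the reporting table.
SELECT  i.name AS index_name,
        ps.index_type_desc,
        ps.avg_fragmentation_in_percent,
        ps.page_count,
        ps.forwarded_record_count      -- reported for heaps only, NULL for indexes
FROM    sys.dm_db_index_physical_stats(
            DB_ID(), OBJECT_ID(N'dbo.ReportTable'), NULL, NULL, 'SAMPLED') AS ps
INNER JOIN sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id;
```

If the table is a heap rather than a clustered index, growing rows show up as forwarded records instead of page splits, which is why that column is included.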


Alex Suprun
dwilliscp
Alexander Suprun (9/7/2012)
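Usually the first approach is faster.
If any of these three columns is variable length (say, varchar), the rows are inserted with those columns empty and then grow when the UPDATE statements fill them in. On pages that are already full, that can cause a lot of page splits and therefore fragmentation.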


I did run some tests, but the results were not steady; there is a lot of activity on this box. The updates do tend to show less variation: they can take 5% longer or up to 20% less time, depending on the test. Again, due to the load on the box, I just do not put any faith in the tests.

All the fields from the join are varchar, 50 to 200 characters. That could explain why this had the widest variation in run times. Since we have I/O pressure on this box, and on the production one this will get released to, I am going to go with the updates.
ChrisM@Work
dwilliscp (9/10/2012)
I did run some tests, but the results were not steady; there is a lot of activity on this box. The updates do tend to show less variation: they can take 5% longer or up to 20% less time, depending on the test. Again, due to the load on the box, I just do not put any faith in the tests.

All the fields from the join are varchar, 50 to 200 characters. That could explain why this had the widest variation in run times. Since we have I/O pressure on this box, and on the production one this will get released to, I am going to go with the updates.


Why not post the actual plan for both versions here?
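
For anyone unsure how to capture an actual plan, a minimal sketch follows. The procedure names are hypothetical wrappers for the two versions of the load and are not taken from the thread.

```sql
-- Return the actual execution plan as an XML result for each statement run.
SET STATISTICS XML ON;

EXEC dbo.LoadReport_Joins;    -- hypothetical: the three-outer-join version
EXEC dbo.LoadReport_Updates;  -- hypothetical: one join plus three updates

SET STATISTICS XML OFF;
```

Each plan comes back as an XML document that can be saved as a .sqlplan file and attached to a forum post.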
