December 5, 2007 at 9:17 pm
Comments posted to this topic are about the item Hidden RBAR: Triangular Joins[/url]
First, thank you for reading my article and any feedback that you my post... good or bad. The exchange of ideas is how we all learn and your feedback is certainly a part of that process and is most welcome.
Second, there's a bit of a bug in the forum software... when they republish an article, it changes the date on the article. This article was originally published on 2007-12-06. The webmaster has been advised and there's nothing they can do about it for the moment. I appologize if this leads to any confusion and I'd still like to hear your comments even if you may have seen the article before.
Third, Yes, I actually did do a followup on the running total problem. The link that follows will take you there, but be advised... although the high speed update method that uses an index hint does always work, don't use the method as a substitute for an ORDER BY on SELECTs. I'm working on a rewrite of the article that explains why and hope to have it out soon. Also be advised that if you don't follow the instructions that I've given to a "T", you could get wrong answers... of course, that would be true of any code, wouldn't it? Here's the link...
http://www.sqlservercentral.com/articles/Advanced+Querying/61716/
Last, but not least, if you're interested in other articles that I've written, please see the following link... I'd be very interested in your comments on those, as well
http://www.sqlservercentral.com/Authors/Articles/Jeff_Moden/80567/
Thanks again, folks. I'm humbled by all of your responses and the time you spent writing them up.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 5, 2007 at 10:28 pm
So... the major problem with your article is that it posts no solutions for the problem at hand, or no pointers to where to look further to increase one's kungfu skills in this area.
I can't find one of my JCelko books that talks about calc'ing running sums in SQL and how he or some of his readers do it, but that would be one good* place to look ("SQL for Smarties" etc).
(* I realize that some people out there have a definite love-hate..no, hate relationship with JCelko's scribblings...but I don't, so feh. But if you have other resources, do add them to the thread).
December 5, 2007 at 10:45 pm
Heh... sorry about that, but at least I got your attention....:D
I should have mentioned that I'm working on an article (may end up being two) that covers a high speed method to do running totals and the like in SQL Server 2000... I finally got 2k5 and will have to load it before comparing the methods to the RANK methods for running counts performance wise.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 5, 2007 at 11:52 pm
Cool. It would be good to see it.
I came to my current job using SS2000 after using Oracle 9i. Although slightly obtuse and wordy, Oracle's partitioning for aggregate functions seemed to make certain problems in this class much easier to work around, too.
December 6, 2007 at 12:24 am
Hi ...
Thats an interesting article and the issues you bring up seem to have merit. If you plan to give examples of problems, I'd recommend you also provide the solutions in the same article.
ciao
December 6, 2007 at 12:40 am
Heh... lost leader for one or two articles coming up... forgot to mention that in the article... sorry about that.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 6, 2007 at 1:49 am
Hi Jeff,
thanks for this article.
A solution to this problem is to use procedural code (at least for SQL2K).
Probably not the intention of the author...:D
-- Declarations
DECLARE @tblResult TABLE (asset_id char(9), sum_amt decimal(19,3), count_amt int)
DECLARE @asset_id char(9), @sum_amt decimal(19,3), @count_amt int
SELECT @asset_id = ''-- This is our iteration key
SET @sum_amt = 0-- Will be incremented for each record
SET @count_amt = 0-- Will be incremented for each record
WHILE 1=1 BEGIN-- Loop over all asset_ids
-- Load the next asset_id and add amount to running total var and increase count for the new record.
SELECT TOP 1 @asset_id = asset_id, @sum_amt = @sum_amt + cap_issue_amt, @count_amt = @count_amt + 1
FROM dbo.RRefasset WHERE asset_id > @asset_id ORDER BY asset_id ASC
IF @@ROWCOUNT = 0 BREAK
-- Insert new row into temporary results table
INSERT @tblResult SELECT @asset_id, @sum_amt , @count_amt
END
SELECT asset_id, sum_amt, count_amt FROM @tblResult
This runs in 3 seconds on my machine for final 40000 rows.
If I understand this correctly, with the triangle there is no perfect execution plan since the engine starts with matching only a few rows ("index seek class") and then ends up with matching the full table ("table scan class") for the last rows. Do you have more details on this?
(Edit: Added linebreak in code to avoid scrolling)
Best Regards,
Chris BΓΌttner
December 6, 2007 at 2:32 am
My immediate reaction was "Interesting, but where's the hint as to how to avoid/mitigate the problem?". I have a real-world case that suffers from poor performance and this has pointed me back to an area which I already knew caused a problem, but hadn't had a chance to focus on. I look forward to seeing your later articles.
In case you're interested, a simplified explanation of problem is invoices that get queries raised. If there's no query, the status goes straight from 'F' (Finance) to 'C' (Complete)' If there's a query, status goes from 'F' to 'P' (Client - I don't know the history of why it's 'P'), then later back to 'F'. Once back at 'F' it may go to 'C' or 'P' again. The data extracted from the business system simply has a list of invoice numbers, dates/times and statuses (plus other info not relevant to this problem). I need to get the turnaround time for query responses, i.e. the difference in date/time for each invoice with a 'P'->'F' status change. Historicaly, other status changes have been tracked, so we haven't preselected on particular statuses and the current view has at its core has a query similar to:
select
a.invoice_id,
a.status as 'stA',
a.stdate as 'dtA',
b.status as 'stB',
min(b.stdate) as 'dtB'
from
workflow a
join workflow b
on a.invoice_id = b.invoice_id
and a.stdate < b.stdate
and a.status <> b.status
group by
a.invoice_id,
a.status as 'statusA',
a.stdate as 'stdateA',
b.status as statusB
I already knew from the execution plan that this was a killer for the performance of this query. Now I know I need to revisit it.
Any suggestions wlecome! π
Derek
December 6, 2007 at 4:47 am
RBAR - also known as 'slow-by-slow'.
December 6, 2007 at 5:37 am
Jeff Moden (12/5/2007)
....... I finally got 2k5 and will have to load it before comparing the methods to the RANK methods for running counts performance wise.
Well it's about time ! :w00t:
______________________________________________________________________
Personal Motto: Why push the envelope when you can just open it?
If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.
Jason L. SelburgDecember 6, 2007 at 6:52 am
That was an excellent article. I look forward to seeing the workaround!
Thanks.
December 6, 2007 at 6:59 am
Hi Jeff,
thanks for this article.
A solution to this problem is to use procedural code (at least for SQL2K).
Probably not the intention of the author... π
Thanks for the feedback and the code, Christian. Heh, yeah, as you can see, the WHILE loop is much, much faster in this case... using triangular joins would cause a run that you'd probably cancel after 20 minutes...
I'll be sure to include some "timings" in the upcoming article(s).
--Jeff Moden
Change is inevitable... Change for the better is not.
December 6, 2007 at 7:05 am
"Interesting, but where's the hint as to how to avoid/mitigate the problem?"...
In case you're interested, a simplified explanation of problem is invoices that get queries raised.
Yeaup.... that's where I first ran into similar problems... millions of invoices...
I look forward to seeing your later articles.
Thanks, Derek... sorry for the "teaser".
--Jeff Moden
Change is inevitable... Change for the better is not.
December 6, 2007 at 7:08 am
Heh... that makes a pretty cool sounding acronymn, too... "SbS" π
--Jeff Moden
Change is inevitable... Change for the better is not.
December 6, 2007 at 7:09 am
Jason Selburg (12/6/2007)
Jeff Moden (12/5/2007)
....... I finally got 2k5 and will have to load it before comparing the methods to the RANK methods for running counts performance wise.Well it's about time ! :w00t:
Heh... that what it's always been about... "time"... π
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 255 total)
You must be logged in to reply to this topic. Login to reply