Hidden RBAR: Triangular Joins

  • On the other hand, if your goal is for elegant well written code above all, especially in a more purely theoretical setting, then you can largely if not completely ignore the physical aspects.

    My definition of "elegant" and "well written" code are not the same nor even my goal... My single goal is to write code that is both performance enabled and scalable. That means that if the number of rows processed doubles, the duration and resources used by the code will do no more than double. Well written set based code may not even double.

    I respectfully submit that if you ignore the physical impact of the code, then your code will be perfomance challenged. In my poor old eyes, true set based code will never be performance challenged and the duration/resources used will never outstripe linearity with respect to rowcounts.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff.

    So u mean to say that there is a triangular join in my example. Since there are other conditons in the where clause (i did not put them in the example) this would be ok. but i need to be carefull..

    "Keep Trying"

  • TheSQLGuru (12/6/2007)


    I must say that I am very happy that people can do things like the triangular join - and sad that Jeff may make less people do it. Lost work opportunities for me!! :hehe:

    Probably not. The people who hire you to fix things most likely don't read sites like this, or they possibly wouldn't have issues in the first place.

    As a soon-to-be colleague of mine said, after I told him his combined sql dev/performance course would reduce the amount of performance-tuning work available, 'Not at all. There are always idiots in the world'

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Jeff, Chirag,

    I am not sure if I misunderstood something, but the SQL Statement does NOT utilize a "triangular join":

    Select orderid,orderdate,productid

    from orders O INNER JOIN orderdetails OD

    On O.orderid = OD.OrderId

    WHERE OD.productid <> 30.

    It is only limiting the results in the where clause.

    Take a look at the picture

    The join results in the diagonal from the upper right corner (1,1) to the lower left corner (4,4) which is the white area (not the blue one). The inequality operator in the where clause (lets make it <> 3 for this example) would only reduce the result set by the 3,3 result to the final result set of {1,1;2,2;4,4}. This results in linear complexity.

    Let me know if I misunderstood anything.

    Best Regards,

    Chris Büttner

  • Great article. I agree that there will always be idiots in the world that can only help to make me look good 🙂

    The query in question in the previous message is just a normal inner join - there's no hidden RBAR or anything else going on in it. I think, given the topic, people assumed that the inequality was in the table join, in which case it is almost a catesian product minus a few rows.

    Like Jeff I've only just started to look at SQL 2005 and the extra operations such as OVER, etc... I haven't compared performance methodically but, once you get used to them, they're certainly handy and, if used properly, do not affect performance (eg row numbering that matches the order by of the query *shouldn't* have any impact), but numbering rows in a different order, whilst pretty neat looking, is more work for the engine.

  • Adam Machanic liked a cursor-based solution for running sums:

    http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/running-sums-redux.aspx

  • I have compared the CURSOR method against the WHILE loop, and on my machine the first one completes in 10.990 seconds for 80000 rows, and the latter completes in 9.460.

    So I would still avoid the cursor even though they are quite close with regards to performance / complexity.

    Best Regards,

    Chris Büttner

  • Nice Article Jeff!!!

    Looking forward to Part 2, 3 & 4....:D

    Before 2K5, many a times, we used the following methodology in our financial applications to get the running count for records. People do argued on this stating that the optimizer would not always correctly update the rows or will not correctly use the specified index. Though, it never had any problems in our applications till now, otherwise we would have been screwed up.;)

    UPDATE #AnyTable

    SET @Total = RunningTotal = SomeValue + @Total

    FROM #AnyTable WITH( INDEX( IX_#SomeIndex_AnyTable ) )

    --Ramesh


  • Christian Buettner (12/7/2007)


    Hi Jeff, Chirag,

    I am not sure if I misunderstood something, but the SQL Statement does NOT utilize a "triangular join":

    Select orderid,orderdate,productid

    from orders O INNER JOIN orderdetails OD

    On O.orderid = OD.OrderId

    WHERE OD.productid <> 30.

    That particular query you just wrote does not make a triangle join, but it's not the same as the query Jeff used in his example. Jeff's example would have been:

    Select orderid,orderdate,productid

    from orders O,orderdetails OD

    WHERE O.orderid <> OD.OrderId

    or if you rewrite it in ANSI-92 notation:

    Select orderid,orderdate,productid

    from orders O

    inner join orderdetails OD

    ON O.orderid <> OD.OrderId

    And in this case- it's not really a triangular join either, since it grows twice as fast. It's essentially a "square join" or cartesian product on n-1 (growth factor (n-1)^2). The difference is that the inequality is in the join criteria.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Be careful Ramesh - I don't believe that 'special' feature of sql server on the UPDATE statement is guaranteed to work in the future. It may also be dependent on some ordering that can't be guaranteed in all situations too. It certainly does rock from a performance standpoint when it is available though!!

    I do agree with the general theme many have that there will always be poor designers/coders/admins out there. I see the same bad stuff at every client I go to. Easy pickings - and gosh does it make me look good! 😎

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Hi Matt,

    The query I "wrote" was just a copy of the query that Chirag posted.

    Best Regards,

    Chris Büttner

  • Obviously, the "special" UPDATE relies purely on the guaranteed ordering of the rows to work, which I should have been mentioned earlier.

    All of our applications, have a guaranteed ordering of rows and I believe that we can always make the order of the rows assured.

    Performance-wise, I could say it should always perform better than the more genuine approaches (i.e cursors or correlated queries).

    --Ramesh


  • Hi Ramesh,

    what makes you sure that the order of the rows is guaranteed in your example?

    Best Regards,

    Chris Büttner

  • Christian Buettner (12/7/2007)


    Hi Matt,

    The query I "wrote" was just a copy of the query that Chirag posted.

    Oops - i guess I missed that one...sorry!

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Christian Buettner (12/7/2007)


    Hi Ramesh,

    what makes you sure that the order of the rows is guaranteed in your example?

    Index

    --Ramesh


Viewing 15 posts - 46 through 60 (of 256 total)

You must be logged in to reply to this topic. Login to reply