Understanding and Using APPLY (Part 2)

  • Mike and Green Clay:

    Yes the ORDER BY in the CROSS JOIN example should be as you say - oddly it is only the image in the article that has this typo, the downloadable code is correct.

    Paul

  • More great stuff Paul, thanks!

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thank you Paul!

    You post it just when I needed to use the CROSS APPLY to process a XML file 🙂

  • Always great stuff from you, Paul.

    Well done.

    -- Gianluca Sartori

  • Nice article, Paul. With this and the previous one I think I'm beginning to understand apply (a feature I haven't yet used, but probably will sometime soon now that you have shown me how it can be used to express things in a nice clean modular fashion).

    Tom

  • After reading Part I last week I knew that Part II is going to be good, but I did not expect it to be this good! This was easy to understand, yet comprehensive and inviting further digging introduction to APPLY. Too bad that at work I have to work with SQL server 2005 at compat 80 databases, so utilizing it is not yet possible, but the day will come.

    Excellent work Paul, thank you.

    Oleg

  • Oleg Netchaev (4/20/2010)


    After reading Part I last week I knew that Part II is going to be good, but I did not expect it to be this good!

    Thank you very much! I hope you manage to break free from 2000-compatibility mode soon.

    My thanks also to Tom Thompson, Gianluca, JoseP, Barry, UMG, Lutz, Steve, Jason, ckuroda, Tom Garth, Trey, Wayne, dbowlin, Clement, and Peter for all the kind comments.

    I am considering a Part III, to cover some specific cases where APPLY can produce solutions which are more efficient than any other method...we'll see.

    Paul

  • Paul White NZ (4/20/2010)


    I am considering a Part III, to cover some specific cases where APPLY can produce solutions which are more efficient than any other method...we'll see.

    Paul

    Wow, thats a great news... APPLY is something i have not applied at all in any of my code and now i think i will start applying it 😀

    Thanks Paul for nice pieces of cake, waiting to eat the third piece soon :hehe:.Great job!!

    ~Edit : Cleared some mess i made in the Quote!

  • I too would like to see more information on exactly what is required for the optimizer to be able to translate APPLY to a JOIN so it can use something other than NESTED LOOP joins. I have cleaned up some performance messes where devs went hog-wild on APPLIES and got crushed with kajillion-row loops. But I don't use nor know enough about APPLY to be as helpful as a consultant as I should be. I definitely look forward to the third article you mention!!

    BTW Paul, would you mind if I took this material and made a presentation out of it to give at user group meetings and SQL Saturdays? I would of course give you credit for the work!!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (4/22/2010)


    I too would like to see more information on exactly what is required for the optimizer to be able to translate APPLY to a JOIN so it can use something other than NESTED LOOP joins. I have cleaned up some performance messes where devs went hog-wild on APPLIES and got crushed with kajillion-row loops. But I don't use nor know enough about APPLY to be as helpful as a consultant as I should be. I definitely look forward to the third article you mention!!

    A kajillion is quite a lot, yes 😉

    Thanks for you your comments there - some very interesting ideas!

    I had not originally planned to write a third part, but perhaps I underestimated the level of interest.

    Just finishing off another series today, but I will look at this next.

    BTW Paul, would you mind if I took this material and made a presentation out of it to give at user group meetings and SQL Saturdays? I would of course give you credit for the work!!

    Of course - I would be flattered if you could find such a use for it. If the original articles in Word format would be of use to you, please mail me. Thanks again!

    Paul

  • The only thing not mentioned, or emphasized in this is the performance of using APPL.

    For tiny databases / small result sets, this is awesome. For large result sets, it's terrible.

    The problem is that the APPLY operator applies the function to each row individually. If you watch SQL profiler while you're running your query / stored proc that uses apply, you'll see a call to the function that's applied for each row. Row by row processing is far slower than pulling all of the data in one swell foop in a single query. I've seen reports that use the apply operator that have nearly taken down a server due to the row by row processing. As soon as we axed the apply operator and put in a normal join onto the table (or tables) that it referred to, the report suddenly became zippy despite pulling back a huge data set.

    Therefore, I suggest you use caution when using APPLY and user defined functions in your code because although non-SQL coders think they're cool reuseable code, in reality they can seriously hurt performance for larger databases.

  • LadyRuna (1/13/2012)


    The only thing not mentioned, or emphasized in this is the performance of using APPLY.

    For tiny databases / small result sets, this is awesome. For large result sets, it's terrible.

    The problem is that the APPLY operator applies the function to each row individually. If you watch SQL profiler while you're running your query / stored proc that uses apply, you'll see a call to the function that's applied for each row. Row by row processing is far slower than pulling all of the data in one swell foop in a single query. I've seen reports that use the apply operator that have nearly taken down a server due to the row by row processing. As soon as we axed the apply operator and put in a normal join onto the table (or tables) that it referred to, the report suddenly became zippy despite pulling back a huge data set.

    Therefore, I suggest you use caution when using APPLY and user defined functions in your code because although non-SQL coders think they're cool reuseable code, in reality they can seriously hurt performance for larger databases.

    This is true, but the blame lies with scalar and multi-statement T-SQL functions that access data, not APPLY. In-line functions with APPLY are highly recommended in part 1 for this reason. My advice is to be very careful about using T-SQL scalar or multi-statement functions at all - especially if they access data. In general though, APPLY is an enormously powerful tool that every developer should learn to use effectively.

  • Very good article Paul.

    the explanation is very good and easy understanding.

    Thanks!


    [font="Times New Roman"]rfr.ferrari[/font]
    DBA - SQL Server 2008
    MCITP | MCTS

    remember is live or suffer twice!
    the period you fastest growing is the most difficult period of your life!
  • I've been learning CROSS APPLY lately and found some interesting things. I have a query that can easily be written as a JOIN (Sorry, it seems my indentation is lost here):

    select *

    From OR_NCCore.dbo.Order_Dim nco

    INNER JOIN or_nccore.dbo.STATUS_DIM S

    ON nco.STATUS_ID = S.STATUS_ID

    inner join (

    select *, rn=row_number()

    over (partition by order_id order by order_item_id asc)

    from or_nccore.dbo.ORDER_ITEM_DIM

    ) nci

    on nco.order_id=nci.order_id and nci.rn = 1

    I can also write it as a CROSS APPLY:

    select *

    From OR_NCCore.dbo.Order_Dim nco

    INNER JOIN or_nccore.dbo.STATUS_DIM S

    ON nco.STATUS_ID = S.STATUS_ID

    cross apply (

    select MAX(order_item_id) max_id

    from or_nccore.dbo.ORDER_ITEM_DIM

    where nco.order_id=order_id

    having MAX(order_item_id) is not null

    ) nci

    The first interesting thing is that I have to add the "having" clause in the CROSS APPLY or else it returns null values for non-matching order_ids (but then, that's how aggregates work).

    The other interesting thing is that there are substantial differences in the CPU time and IO:

    Join:

    (70087 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'STATUS_DIM'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'ORDER_DIM'. Scan count 1, logical reads 4124, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'ORDER_ITEM_DIM'. Scan count 1, logical reads 3002, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 2766 ms, elapsed time = 23370 ms.

    Cross Apply:

    (70087 row(s) affected)

    Table 'Worktable'. Scan count 70126, logical reads 448634, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'ORDER_ITEM_DIM'. Scan count 1, logical reads 1394, physical reads 1, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'ORDER_DIM'. Scan count 1, logical reads 4124, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'STATUS_DIM'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1625 ms, elapsed time = 9035 ms.

    So, the CROSS APPLY runs faster but does tons more I/O to Worktable. Why would this be?

  • Try TOP (1) ... ORDER BY inside the APPLY instead of the MAX...HAVING. Happy to explain the performance difference if you can attach the actual execution plans for both to your reply.

Viewing 15 posts - 16 through 30 (of 51 total)

You must be logged in to reply to this topic. Login to reply