What is RBAR? How can I avoid it?

  • bjesmith2

    SSC Veteran

    Points: 279

    Hi,

    I'm new to t-sql and I do not know much regarding how to write efficient queries.

    I read an article which referred to RBAR and set based programming (I hope I'm referring to this method right) and I would like to learn more. Can anyone recommend an article or post explains what set based programming is?

  • Sankar Reddy

    SSCrazy

    Points: 2410

    Set based solutions tend to be faster than the RBAR operations while taking less system resources. Doing a task 100 times is much likely to take more time than doing one task that does 100 actions.

    RBAR operations have thier place in administration but not in OLTP systems. Set based solutions often appear to be complex but once you know the SQL, it will be easy. Mostly newcomers tend to use RBAR operations because of thier relative ease in using them. These two articles might help you get started but you may have follow-up questions. Let us know if we can help in answering any specific questions you may have.

    http://www.simple-talk.com/sql/t-sql-programming/rbar--row-by-agonizing-row/

    http://code.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=AlternativesToSQLCursors&referringTitle=Home

    [font="Courier New"]Sankar Reddy | http://SankarReddy.com/[/url][/font]

  • Garadin

    One Orange Chip

    Points: 29613

    Loops, cursors, complex correlated subqueries, inline functions with data access, functions in the where clause (I believe, don't see this one much)... all are examples of RBAR. In other programming languages, RBR or "Row by Row" processing is fine. If you want to write loops 7 levels deep in VB, go right ahead, it loves them. In SQL, RBR processing gets an additional "A" (courtesy of Jeff Moden) to emphasize how much SQL hates it. In SQL, it is Row by AGONIZING Row, because SQL is not meant to "loop". SQL is meant to do everythingg in a set. There are many articles on this forum that will show you how to properly do set based programming (in fact, almost every situation has a set based method that can be applied to it, so most articles will present these).

    If you're new, the biggest concepts to familiarize yourself with will be CTE's and derived tables. These are relatively simple techniques that can make a HUGE impact on how you write queries.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Jeff Moden

    SSC Guru

    Points: 994239

    I agree with Seth and would like to point out a couple of other forms of RBAR to avoid...

    CTE's are a wonderful tool... until you write a recursive CTE which is slower than a cursor or a while loop. Just because it doesn't have an explicitly defined cursor or while loop, it doesn't necessarily mean that it's "set based".

    A good example of "hidden RBAR" comes in the form of what are known as "triangular joins". These insidious monsters of code are actually "RBAR on sterioids" and can be thousands of times worse than any form of cursor or while loop that you can possibly imagine. What makes them insidious is two things...

    1. They look like set based code.

    2. They seem to not only get the job done with small row counts, but they seem to work very fast with small row counts. The problem is, they get exponentially worse as the row count increases much like a "Cartesian Product" (otherwise known as a CROSS JOIN) does.

    For a full explanation on what to look for, please see the following article... generally speaking, inequalities in a JOIN ON or WHERE clause can be (not always) deadly to the performance of a system...

    [font="Arial Black"]Hidden RBAR: Triangular Joins[/font][/url]

    As you've seen in this very thread, lot's of people believe that writing set based code is difficult and that the code gets a lot longer. Patently, not the case. The key to writing set based code is simply to make a pardigm shift... stop thinking about what you want to do to each row and start thinking about what you want to do to an entire column in a set of rows.

    Once you make that shift in thinking, start practicing it... you cannot play the piano well if you never touch the piano. Instead of doing like many do and giving up on a set based solution simply because you can't think of one, struggle a bit and find one. Train your brain to start thinking in terms of how to use a database instead of a GUI.

    To put my money where my mouth is on the length and simplicity of set based code, let's address the lowly "split" function... Here's one way of doing it with a While Loop in a function...

    --Creates an 'InLine' Table Valued Function (TVF)

    CREATE FUNCTION dbo.Split

    ( @Delimiter varchar(5),

    @List varchar(8000)

    )

    RETURNS @TableOfValues table

    ( RowID smallint IDENTITY(1,1),

    [Value] varchar(50)

    )

    AS

    BEGIN

    DECLARE @LenString int

    WHILE len( @List ) > 0

    BEGIN

    SELECT @LenString =

    (CASE charindex( @Delimiter, @List )

    WHEN 0 THEN len( @List )

    ELSE ( charindex( @Delimiter, @List ) -1 )

    END

    )

    INSERT INTO @TableOfValues

    SELECT substring( @List, 1, @LenString )

    SELECT @List =

    (CASE ( len( @List ) - @LenString )

    WHEN 0 THEN ''

    ELSE right( @List, len( @List ) - @LenString - 1 )

    END

    )

    END

    RETURN

    END

    go

    --demo the split

    SELECT *

    FROM dbo.Split( '-', '111111-0000-9999' ) AS s

    And, they call that "simple"... 😉

    Now, consider doing the same thing using a set based method that employs a "helper table" known as the "Tally" or "Numbers" table...

    --===== Creates an 'InLine' Table Valued Function (TVF) that will use

    -- up to a 5 character delimiter

    CREATE FUNCTION dbo.TSplit

    (

    @Delimiter VARCHAR(5),

    @List VARCHAR(8000)

    )

    RETURNS @TableOfValues TABLE

    (

    RowID SMALLINT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

    [Value] VARCHAR(8000)

    )

    AS

    BEGIN

    --===== Now, do the split using the high speed set based method made

    -- possible by a Tally table. The single character delimiter of

    -- CHAR(1) replaces the original (up to) 5 character delimiter.

    INSERT INTO @TableOfValues

    ([Value])

    SELECT SUBSTRING(L.List,N+1,CHARINDEX(CHAR(1),L.List,N+1)-N-1)

    FROM dbo.Tally t,

    (--=== Replace the original delimiter with single delimiter

    SELECT CHAR(1) + REPLACE(@List,@Delimiter,CHAR(1)) + CHAR(1) AS List

    )L

    WHERE t.N < LEN(L.List)

    AND SUBSTRING(L.List,t.N,1)=CHAR(1)

    ORDER BY t.N

    RETURN

    END

    GO

    --===== Demo the split

    SELECT *

    FROM dbo.TSplit( '-', '111111-0000-9999' ) AS s

    (Side bar: There's a way of using FOR XML to split things in a set based fashion that doesn't use a Tally table that's also nasty fast... it'll be in the final two links I post on this thread).

    Now, as you look at that little slice of computational heaven, you'll make the realization that there's another part to writing good, high performance set based code other than just thinking in columns... you actually have to know what SQL Server does. 😉 The very first thing I did when I first started learning SQL (or any language, for that matter) is to study the "meat" of the language and that "meat" is in the form of what the functions do. You don't have to memorize the exact syntax of every function (I have to lookup how STUFF works, every time), but at least know what can be done so that when something comes up, you can figure out how to do things with a function or two instead of having to resort to RBAR. In the databases for the company I'm currently working for, I actually found a user defined function to calculate MOD (Modulo) because the dummies that wrote the code didn't know that there's a MOD operator built into SQL Server.

    So... I simply wouldn't spend to much time trying to learn about RBAR... rather, spend the time practicing how to avoid it and spend the time to learn about SQL Server instead of just becoming another "SQL Drone Clone". Spend some time answering questions on this forum and read all the articles that come out... then, try to figure out a better way.

    For more information on the "Tally" table I used in the code above, please see the following article on how it's made and how it works.

    [font="Arial Black"]The "Numbers" or "Tally" Table: What it is and how it replaces a loop.[/font][/url]

    For more on how to do "splits" including the FOR XML method, please see the following...

    [font="Arial Black"]Passing Parameters as (almost) 1, 2, and 3 Dimensional Arrays[/font][/url]

    Don't forget to "practice". 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Jeff Moden

    SSC Guru

    Points: 994239

    By the way... the following article on "Alternatives to SQL Server Cursors"...

    http://code.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=AlternativesToSQLCursors&referringTitle=Home

    ... sucks. In one case, they use a WHILE LOOP instead of a cursor and in other they use an ordered correlated sub-query. Both are forms of RBAR to avoid. For the record, a "firehose" cursor (read only, forward only or "static) is just as fast as a while loop and can be easier to use. I'm, of course, not suggesting you actually use either. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Jeff Moden

    SSC Guru

    Points: 994239

    Sankar Reddy (1/24/2009)


    Set based solutions tend to be faster than the RBAR operations while taking less system resources. Doing a task 100 times is much likely to take more time than doing one task that does 100 actions.

    RBAR operations have thier place in administration but not in OLTP systems. Set based solutions often appear to be complex but once you know the SQL, it will be easy. Mostly newcomers tend to use RBAR operations because of thier relative ease in using them. These two articles might help you get started but you may have follow-up questions. Let us know if we can help in answering any specific questions you may have.

    http://www.simple-talk.com/sql/t-sql-programming/rbar--row-by-agonizing-row/

    http://code.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=AlternativesToSQLCursors&referringTitle=Home%5B/quote%5D

    Just a friendly note, Sankar... if you actually believe that RBAR is a bad thing, then you have to stop posting code that uses it even if it's not yours. 😉 Case in point...

    http://www.sqlservercentral.com/Forums/Topic641682-338-1.aspx#bm642949

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Sankar Reddy

    SSCrazy

    Points: 2410

    Jeff, point taken. I will avoid that next time. I was just demonstrating a technique.

    [font="Courier New"]Sankar Reddy | http://SankarReddy.com/[/url][/font]

  • Jeff Moden

    SSC Guru

    Points: 994239

    Sankar Reddy (1/24/2009)


    Jeff, point taken. I will avoid that next time. I was just demonstrating a technique.

    Heh... understood and thanks. I was picking on you a bit I don't want a newbie that reads that other post to think it's the right way just because one of us posts it. That, and the sheer irony of it all. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Jeff Moden

    SSC Guru

    Points: 994239

    bjesmith2 (1/23/2009)


    Hi,

    I'm new to t-sql and I do not know much regarding how to write efficient queries.

    I read an article which referred to RBAR and set based programming (I hope I'm referring to this method right) and I would like to learn more. Can anyone recommend an article or post explains what set based programming is?

    So... how'd we do? 😉

    Since you're a bit new to T-SQL and these fine forums, let me welcome you to both. There's definately two different ways to write SQL and you alluded to probably the single most important question there is. The problem is, it's a huge answer that continues to grow with every thread on this forum. I talked a bit about the paradigm shift one has to make to achieve the set based programming state of mind, but doing things like "looping without looping" will take a bit of study on your part. And, I have to tell you, it's absolutely worth it. Kinda fun, too! There's nothing better than feeling you get when you're able to tell your boss something like "Yep... fixed that 24 hour run... now it does 50% more work and it only takes 15 minutes to run. What's next?" (True story, by the way) 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • steve dassin

    SSCrazy

    Points: 2396

    What users here call 'RBAR' is simply a form of redundancy. In certain

    situations the system isn't smart enough to use prior information and

    starts the same process over and over again. A 'picture' of it is given here:

    "Visualizing a ranking query"

    http://beyondsql.blogspot.com/2007/06/dataphor-sql-visualizing-ranking-query.html

    The sql answer to much of 'RBAR' is applying a transformation to the table to derive appropriate values. The analytic ranking functions in sql server imply a cursor over the table. This allows rows to be processed in 'order' and eliminates redundancy (a single pass over the table). More here:

    "The Sql ranking OVERture"

    http://beyondsql.blogspot.com/2008/04/sql-ranking-overture.html

    For many developers RBAR perhaps could be an acronym for:

    R(ight) B(ack) (where you started from) A(fter) R(iting) (the same query):)

    www.beyondsql.blogspot.com

  • bjesmith2

    SSC Veteran

    Points: 279

    To Everyone,

    Thank you for the abundance of replies (I really did not expect to see so many responses, so quickly). I truly appreciate all of the suggestions, article references and examples of code each of you shared.

    I will post responses for each of your replys. However, I ask you allow me a little time to read and "digest" everything first.

    Again, I thank all of you for your kindness and generosity (and I promise to watch for post replies a little more carefully!).

  • The Dixie Flatline

    SSC Guru

    Points: 53197

    For many developers RBAR perhaps could be an acronym for:

    R(ight) B(ack) (where you started from) A(fter) R(iting) (the same query)

    Frankly, I get the visual impact of "Row By Agonizing Row" much more clearly. 😉

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • bjesmith2

    SSC Veteran

    Points: 279

    Jeff,

    All of your responses were excellent! I was astonished by the number of replys to what I thought would be an afterthought question for most of you. I truly appreciate all of the posts.

    Also, In your last post you mention the "shift" in thinking that's needed to use set based programming. This shift is exactly what I'm looking for.

    My primary job is to pushing out reports through SSRS or whatever else is thrown at me. Therefore, to have quick running reports, I'm always looking for ways to make queries that run as quickly as possble.

    However, I must admit all of my queries must have RBAR (I just didn't know). When I came across your article describing RBAR in triangular joins (which I've used with smaller datasets) and set based programming. You showed me there is a better way.

    As I mentioned in my previous reply. I need a little time to read all of the posts and try out a few things with my reports. However, I look forward to sharing how this all works out with each of you. Thanks again.

  • Jeff Moden

    SSC Guru

    Points: 994239

    Thanks for the feedback. And, as you can see, avoiding RBAR is quite a hot topic with most of the power users.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Lynn Pettis

    SSC Guru

    Points: 442118

    Cardinal Moden is an excellent resource in the fight against RBAR. I must admit to having learned quite a bit from his excellent mentoring via SSC.

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply