SQLServerCentral Article

A few reflections on RBAR by a weary application developer

,

Relational databases are works of wonder. They are chock clanking full of useful structures and amazing tools. However, under their lovely skins, a database is - each and every one of them - merely an abstraction layer. A database merely works to keep track of bits laid down on disk and serves them up when requested. If we didn't have databases, we would have to invent them. Efforts to discard them, for something like Big Data, merely end up implementing different techniques to perform the same myriad and mundane tasks.

Database administrators, DBAs if you will, have much to teach application developers. If we are smart, we listen. They explain to us the intricacies of accessing their data smartly and efficiently. Like the relational databases themselves, database administrators perform many unseen and unappreciated tasks, grooming their indexes and keeping their statistics. They make the software developer's life easier in many and sundry ways.

Like the priests of a monastery, database administrators have a canon, a common list of rules by which they live. Just like the clerical cannon, DBA rules were developed over the course of millennia. These rules have proven their worth in production environments since the beginning of time. Go against these rules, knave, and surely you will burn.

One of these rules is No RBAR. RBAR is the DBA acronym for Row by Agonizing Row. The term intends to communicate just how infuriating and obnoxious the technique is. Opening a cursor in a query and then looping through all of the records in a table to find a single entry might be an extreme RBAR example, one certain to send an otherwise friendly DBA into a flurry of nine fists and flying ninja stars.

Obviously, the friendly DBA has a good point. To me, the RBAR epithet implies a cursor. Cursors are resource intensive, and are hardly ever needed. Take, for example, this—

DECLARE @Sql VARCHAR(MAX) = ''
DECLARE @Idx INT = 0
DECLARE @MaxID INT = 0
DECLARE @TableNames TABLE
    (
      [ID] INT IDENTITY(1, 1)
               NOT NULL
    , [TableName] VARCHAR(MAX) NOT NULL
    )
INSERT @TableNames
        ( [TableName]
        )
        SELECT '[' + OBJECT_SCHEMA_NAME([object_id]) + '].[' + [name] + ']'
            FROM sys.tables
            WHERE [type] = 'U'
 --user table
SET @Idx = ( SELECT MIN([ID])
                FROM @TableNames
           )
SET @MaxID = ( SELECT MAX([ID])
                FROM @TableNames
             )
WHILE @Idx <= @MaxID
    BEGIN
        SET @Sql = 'UPDATE STATISTICS  ' + ( SELECT [TableName]
                                                FROM @TableNames
                                                WHERE [ID] = @Idx
                                           ) + ';'
  --EXEC (@Sql)
        SET @Idx = @Idx + 1
    END

Such a technique, using an incremented, in-memory index instead of a cursor, must surely remove at least some of the objections that a DBA has against RBAR, shouldn't it?

Certainly not—properly done, RBAR is still objectionable. For example, consider the power of the tally table

DECLARE @Parameter VARCHAR(8000)
    SET @Parameter = ',Element01,Element02,Element03,Element04,Element05,'
SELECT N, SUBSTRING(@Parameter,N,1)
   FROM dbo.Tally
  WHERE N <= LEN(@Parameter)
  ORDER BY N

Now that right there is some lovely non-RBAR action, certain to warm the heart and hackles of any right-minded geek.

As I said, if we application developers are smart, we listen to our DBAs. Follow the DBA canon religiously. Think in columns, not rows. Think in sets, not records.

Of course, all of this excellent guidance is true. And yet, please Mr. or Ms. DBA? Won't you please hear my plea?

Consider a daily data file import, 60k rows. The data is loaded, of course, row by agonizing row, into a temp table. The mighty DBA, with his scimitar, strikes a mighty blow and updates the destination tables in one enormous, amazing join. The update runs in production in under three minutes.

Yay! Everyone cries! We weep with the endless possibilities of mighty set based queries!

The DBA declares his work complete, shoulders his mighty scimitar, and departs for the land of milk and honey. And then one day, of course, the daily data import fails. Information from the SQL error about precisely what failed is sparse. None of the developers can get the mighty join to work, and nobody can figure out what is wrong. The person from the helpdesk says that the server probably ran out of memory.

One of the weakling developers pulls out his penknife and whittles out an SSIS package. Of course, it iterates through the input table row by agonizing row. But here is the deal: it processes until it abends. It fails on the offending record and writes out a meaningful error message. After the problem is fixed, the package can be safely restarted.

Yes, it processes row by agonizing row. Yes, it is inelegant, in fact, it is patently dumb. Yes, it violates the DBA canon and deserves to burn in the pyre. And yet, the process starts, runs, finishes without complaint, and never comes to our attention again.

Follow the guidance of your DBA about RBAR. However, remember that he or she is from a different land. Always do what makes sense. This goes double if you have SMS job failure messages streaming to your cellphone.

Rate

3.4 (40)

You rated this post out of 5. Change rating

Share

Share

Rate

3.4 (40)

You rated this post out of 5. Change rating