SQLServerCentral Article

Coding Standards Part 2 - Formatting

,

Coding Standards - Part 2 - Formatting SQL

This is a continuation of my series on coding standards. If you are interested, you can read the other articles, though you do not need to read it before this one.

Introduction

One of the areas that I feel is very important for managing and maintaining a stable environment is the use of standards. In any SQL Server environment, we deal with any number of objects or areas, but we often work with T-SQL code in our daily work. And as anyone know who has worked with others, having standards greatly improves and simplifies the ability of one person to take over for another and maintain some continuity between the individuals.

This article examines another area of standards that I have found to be helpful in a multi-person team: the formatting of code. I know that formatting is a question of style and I will probably get some arguments here, but having everyone maintain a similar format greatly speeds the ease with which one can examine and understand the code.

The Problem

How often do you get a sliver of code that is formatted as one long line of code? If you work with developers, the answer is probably quite often. Usually a developer formats T-SQL code in a manner that corresponds to the way in which they would format C++, ASP, VB, or any other code. Which may not be how you format the code.

How often do you then reformat it? I know that I have often been sent something like this:

select * from customers, orders, orderline where customers.orderid = orders.orderid
   and orders.orderid = orderline.orderid and orderline.amount > 100

Now this is a simple query, but when I am attempting to troubleshoot something, I like to have it in a format that makes life easier for me. Usually the first step for me would be to format this as follows:

select
*
 from customers, orders, orderline
 where customers.orderid = orders.orderid
 and orders.orderid = orderline.orderid
 and orderline.amount > 100

I would also then take further steps that would further clarify the joins that are occurring.

Why?

Because I understand the workings of a SQL statement by looking for a certain structure and verifying the joins are correct, no cross joins, proper columns, etc. And I often need to add and remove sections for testing, it helps me to quickly follow the logic when everything is arranged in a particular structure.

Setting a coding standard is something everyone should do and, IMHO, it is as important as setting naming standards. As with any standard, it simply makes it easier to work together and more efficiently when we understand each other's code.

I am not advocating you adopt my standards, though that would be nice. At least for me if I ever come to work for your company. However, I do urge that you adopt some standard and feel free to use these are a starting point.

My Solution

Here are a list of items that I have included in standards. These primarily apply to SELECT queries, but they can easily be adapted to INSERT, UPDATE or DELETE queries.

Tabs - Sounds like a minor item, but the formatting of anything starts with tabs. They can really help of hurt someone when editing or even viewing as some editors will reformat your display based on tabs. I set a standard of 3 spaces. Enough to see a difference and still fit fairly long statements on my screen completely.

Aliases - I pretty much always use an alias. I used to do the 'a', 'b', etc. thing, essentially numbering each table sequentially as it appeared in the query. I changed that to adopt a "standard" alias for each table in a database. These are usually 1 or 2 character abbreviations, sometimes 3, that I use over and over. While it seems cumbersome, I had a database with 300+ tables and remembering the aliases was pretty easy for both me and my development team. Some examples:

  • Orders - o
  • OrderLineItems - oi
  • Customers - c
  • CustomerSuport - cs

I'm sure this will be easy for you and it also makes the reading of a query much cleaner. I hate wading through joins that look like:

   and customersupport.phonenumber = customeraddress.phonenumber

It's much easier to see:

   and cs.phonenumber = ca.phonenumber

Spacing - I try to encourage spacing and the use of new lines for formatting queries. I break each section into a new line (see the next item) as well as indent it. I use tabs and spaces judiciously to make each area easier to read. my general rule is to indent using tabs, except for the major section of a query. Each of those is one space down from the query type (insert, select, etc.). An example looks like:

select
*
 from Customers c
 where c.customerid = 5
 and c.customeraddress is not null

I also include spaces around each parenthesis and comma and operator. These make the CTRL- controls handy for moving inside the query. Nothing worse then being unable to get to the > in a comparison because the column names are slammed up against the operator.

Column names - I've gone back and forth on this, but overall I've found that having each column on a separate line to be extremely convenient for commenting them in and out of queries. I also precede columns with a comma, rather than having it follow the column name for the same reason. I often find this the best technique for debugging. Formatting in this way also allows me to easily cut and paste from the column list to the GROUP BY or ORDER BY sections.

I also indent the join syntax by one tab and the ON clause by one further tab. These help with the readability of the query. Example:

select
customerid
, customername
, address
, phone
, status
 from Customers c
inner join orders o
on c.customerid = o.customerid
left outer join orderlines oi
on o.orderid = oi.orderid
and oi.status = 1
 where c.customerid = 5
 and c.customeraddress is not null
 order by
customerid
, customername
, status

Note that I cut and pasted the column list to the order by and then removed a couple columns be selecting the entire line. A small thing, but it makes my typing seem so much faster and efficient.

Sections - Each query can be broken up into sections based on the major keywords. I move each of these to a new line to make the reading easy as well as the navigation convenient. It makes more sense when looking at it, so here is an example:

select
customerid
, customername
, count(*)
 from Customers c
inner join orders o
on c.customerid = o.customerid
left outer join orderlines oi
on o.orderid = oi.orderid
and oi.status = 1
 where customerid = 5
 and c.customeraddress is not null
 and (c.active = 1
OR
c.status > 5
)
 group by
customerid
, customername
 having
count(*) > 1
 order by
customerid
, customername

A few other items worth mentioning. I include special operators (TOP, DISTINCT, etc.) on the first line with SELECT. This keeps me from confusing them as a column of some sort. I also indent subselects in a similar manner, but starting from the point of the parenthesis. I then match up the closing parenthesis with the opening one on its own line as in the last example above.

Conclusions

I know this isn't a complete list, it is merely intended to handle the formatting of SQL. Even there, I'm sure there are things I've left out. As I wrote this I realized how much is involved and how confusing it can be to explain.

Since I am most often working with SQL, this is a high priority area. There are also standards for formatting the various object statements, including comments, etc. I'll be tackling those in another article later on.

I'd be interested to hear what other standards, if any, people have implemented and how they are working. I suspect this is often more of a "my own style" area, but I have learned to adapt to others, and I have found that having a standard truly makes work easier.

As always I welcome feedback on this article using the "Your Opinion" button below.

Steve Jones

©dkRanch.net June 2002

Rate

5 (2)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (2)

You rated this post out of 5. Change rating