Two Best Practices!

By Darwin Hatheway

As a DBA, one of the things that happens to me several times a day is finding a chunk of SQL in my inbox or, worse still, on a piece of paper dropped on my desk. Yes, it's SQL that performs poorly or doesn't do what the programmer expected and now I'm asked to look at it. And, it's often the case that this chunk of SQL is just plain ugly; hard to read and understand. There are two Best Practices that frequently get applied to such messes before I really start analyzing the problem…

BEST PRACTICE 1 - Use Mnemonic Table Aliases

I found this chunk of SQL in a Sybase group today:

select distinct
a.clone_id,b.collection_name,a.source_clone_id,a.image_clone_id,c.library_name,c.vector_name,
c.host_name,d.plate,d.plate_row,d.plate_column,a.catalog_number,a.acclist,a.vendor_id,b.value,c.species,e.cluster
          from clone a,collection b,library c,location d, sequence e
         where a.collection_id = b.collection_id
           and a.library_id = c.source_lib_id
           and a.clone_id = d.clone_id
           and a.clone_id = e.clone_id
           and b.short_collection_type='cDNA'
           and b.is_public = 1
           and a.active = 1
           and a.no_sale = 0
           and e.cluster in (select cluster from master_xref_new where
type='CLONE' and id='LD10094')

I'm sure the news client has damaged the formatting of this a little bit but it's still obvious that the programmer didn't put any effort into making this SQL readable and easy to understand. And there it was in the newsgroups and he wanted us to read and understand it. Wonderful.

For me, the worst part of this query is the table aliases: A, B, C, D, E. I find that I must continually refer back to the "from" clause to try and remember what the heck A or E or whatever represents. Figuring out whether or not the programmer has gotten the relationships right is a real pain in the neck with this query. He's saved typing, sure, but at a tremendous cost in clarity. And I've had much worse end up on my desk: tables from A to P on at least one occasion and about three pages long, with some columns in the SELECT list that weren't qualified by table aliases at all.

Let's rewrite this guy's query for him using this first Best Practice (I'm not going to do anything about his spacing):

select distinct
clo.clone_id,clc.collection_name,clo.source_clone_id,clo.image_clone_id,lib.library_name,lib.vector_name,
lib.host_name,loc.plate,loc.plate_row,loc.plate_column,clo.catalog_number,clo.acclist,clo.vendor_id,clc.value,lib.species,seq.cluster
         from clone clo,collection clc,library lib,location loc, sequence seq
         where clo.collection_id = clc.collection_id
           and clo.library_id = lib.source_lib_id
           and clo.clone_id = loc.clone_id
           and clo.clone_id = seq.clone_id
           and clc.short_collection_type='cDNA'
           and clc.is_public = 1
           and clo.active = 1
           and clo.no_sale = 0
           and seq.cluster in (select cluster from master_xref_new where 
type='CLONE' and id='LD10094')

Without bothering to fix the spacing, isn't this already easier to understand? Which query lends itself to easier maintenance? Trust me, it's the latter, every time.

In some situations, being able to easily identify the source table for a column in the select list can be a big help, too. You may have two different tables which have fields with identical names but which mean different things. Catching those will be easier with mnemonics.
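To make the identical-names point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The two-table schema and its data are hypothetical: it assumes, purely for illustration, that collection and library each have a column called value and that the two mean different things. Other engines word the error differently, but the idea carries over.

```python
import sqlite3

# Hypothetical schema: both tables carry a column called "value",
# but collection.value is a label and library.value is a clone count.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table collection (collection_id integer, value text);
    create table library (library_id integer, collection_id integer, value integer);
    insert into collection values (1, 'public cDNA set');
    insert into library values (10, 1, 384);
""")

# Unqualified, the engine cannot tell which "value" is meant and refuses.
try:
    conn.execute(
        "select value from collection clc "
        "inner join library lib on lib.collection_id = clc.collection_id"
    )
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualified with mnemonic aliases, each column announces its source table.
row = conn.execute(
    "select clc.value, lib.value from collection clc "
    "inner join library lib on lib.collection_id = clc.collection_id"
).fetchone()

print(ambiguous_error)
print(row)
```

With a.value and b.value the query would run just as well, but the reader would have to look up which table each letter stands for before noticing that the two columns are unrelated.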

We can make another big improvement in this query with another best practice...

BEST PRACTICE 2 - Use ANSI JOIN Syntax

Do this to clearly demonstrate the separation between "How do we relate these tables to each other?" and "What rows do we care about in this particular query?"

In this case, I can only guess what the programmer is up to but, if I were a DBA at his site and knew the relationships between the tables, I could use this "relating" vs. "qualifying" dichotomy to help troubleshoot his queries. Let's rewrite this query again (but I'm still not going to do much about his spacing):

select distinct
clo.clone_id,clc.collection_name,clo.source_clone_id,clo.image_clone_id,lib.library_name,lib.vector_name,
lib.host_name,loc.plate,loc.plate_row,loc.plate_column,clo.catalog_number,clo.acclist,clo.vendor_id,clc.value,lib.species,seq.cluster
          from clone clo
          inner join collection clc
                  on clo.collection_id = clc.collection_id
          inner join library lib
                  on clo.library_id = lib.source_lib_id
          inner join location loc
                  on clo.clone_id = loc.clone_id
          inner join sequence seq
                  on clo.clone_id = seq.clone_id
         where clc.short_collection_type='cDNA'
           and clc.is_public = 1
           and clo.active = 1
           and clo.no_sale = 0
           and seq.cluster in (select cluster from master_xref_new where 
type='CLONE' and id='LD10094')

I still can't say for sure that this query is right. However, a DBA who does know this database is going to find it much easier to spot a missing element of the relationship between, say, collection and clone. It's certainly much easier to spot a situation where the programmer failed to include any relationship to one of the tables (it would be obvious to us at this point), so you get fewer accidental Cartesian Products.
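The accidental Cartesian Product is easy to reproduce on a small scale. The sketch below uses Python's built-in sqlite3 module with a cut-down, hypothetical version of three of the tables (only a few columns, made-up rows). The comma-style query leaves out the predicate relating location to clone; the ANSI version gives every joined table its own ON clause.

```python
import sqlite3

# Hypothetical, cut-down tables with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table clone      (clone_id integer, collection_id integer);
    create table collection (collection_id integer, collection_name text);
    create table location   (clone_id integer, plate text);
    insert into clone values (1, 100), (2, 100), (3, 200);
    insert into collection values (100, 'alpha'), (200, 'beta');
    insert into location values (1, 'P1'), (2, 'P1'), (3, 'P2');
""")

# Comma-style FROM: the predicate relating location to clone was forgotten,
# and nothing in the statement makes the omission stand out.
comma_rows = conn.execute("""
    select clo.clone_id, clc.collection_name, loc.plate
      from clone clo, collection clc, location loc
     where clo.collection_id = clc.collection_id
""").fetchall()

# ANSI style: each joined table sits next to the condition that relates it.
ansi_rows = conn.execute("""
    select clo.clone_id, clc.collection_name, loc.plate
      from clone clo
     inner join collection clc on clo.collection_id = clc.collection_id
     inner join location loc   on clo.clone_id = loc.clone_id
""").fetchall()

print(len(comma_rows), len(ansi_rows))  # prints "9 3"
```

Every clone is paired with every location row in the first result. In the ANSI form, an "inner join location loc" with no ON clause after it would look wrong at a glance, and many engines reject it outright.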

In my experience, simply rewriting ugly queries according to these best practices has often pointed up the nature of the problem and made the solution a snap. This certainly happens often enough that taking the time to do the rewrite is worth the trouble.

Another advantage of following this rule is that it allows you to readily steal an important chunk of your SQL statements from any nearby statement that already relates these tables. Just grab the FROM clause out of another statement, put in the WHERE that's customized for this situation and you're ready, with some confidence, to run the query. Since I'm a lazy sort, this feature is a real plus for me.
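Here is a rough sketch of that reuse, again with Python's built-in sqlite3 module and hypothetical, cut-down tables. The name FROM_CLONE_COLLECTION is made up for the example; it simply holds a FROM clause lifted unchanged into two statements whose WHERE clauses differ.

```python
import sqlite3

# Hypothetical, cut-down tables with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table clone      (clone_id integer, collection_id integer, active integer);
    create table collection (collection_id integer, collection_name text, is_public integer);
    insert into collection values (100, 'alpha', 1), (200, 'beta', 0);
    insert into clone values (1, 100, 1), (2, 100, 0), (3, 200, 1);
""")

# The "relating" part, written once and reused as-is.
FROM_CLONE_COLLECTION = """
      from clone clo
     inner join collection clc on clo.collection_id = clc.collection_id
"""

# The "qualifying" part is the only thing that changes per query.
active_public = conn.execute(
    "select clo.clone_id" + FROM_CLONE_COLLECTION +
    " where clo.active = 1 and clc.is_public = 1 order by clo.clone_id"
).fetchall()

inactive = conn.execute(
    "select clo.clone_id, clc.collection_name" + FROM_CLONE_COLLECTION +
    " where clo.active = 0 order by clo.clone_id"
).fetchall()

print(active_public)  # [(1,)]
print(inactive)       # [(2, 'alpha')]
```

With comma-style joins the relating predicates are mixed into the WHERE clause, so there is no clean block to copy; you have to pick the join conditions out from among the filters first.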

So, encourage mnemonic table aliases and use of ANSI JOIN syntax. As Red Green says: "I'm pullin' for ya. We're all in this together." He's right; your programmers might end up at my site or vice-versa someday.
