Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Star Struck

Have you ever run into an error and been puzzled as to why that error occurred?

Recently I have been working on a project to backfill a development and QA environment.  These environments are essential in any database environment to help ensure the production database will continue to work after changes have been made.  These environments existed once upon a time but had been sorely neglected and process was not followed.

For this process, I am using some software to help generate diff scripts between the environments.  Once the script has been generated I will execute that script in the appropriate environment (DEV or QA).  I am doing this process for all tables, procs, views, and functions.  This is also done iteratively database by database.

All-Stars

While performing this routine, occasionally an error might be encountered due to an order of processing that may need to be performed.  In cases such as that, I would typically alter my process to account for this error.  Usually that is not an issue if the tables are first created, then the views, and then the procs and functions.  Other times, the create/alter script must be evaluated to determine what is causing the failure.

One of these failures that caused me a lot of angst was producing the following error in  a few stored procedures.

Msg 205, Level 16, State 1, Line 4
All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.

That error message is fair enough to evaluate.  Based on the error one should expect that the stored procedure contains one of those keywords.  I picked a stored procedure out of the mix that was generating this error and checked the code for it.  Looking through the entire proc, there was no such instance of those keywords.  This was temporarily puzzling.  I continued to investigate and found that there was a view being used in that stored procedure.  I decided to double-check the view and see if it might be the cause.  Inside the view I see something like the following:

SELECT *
FROM dbo.Table1 WITH (nolock)
WHERE AccrueThruDate between '2010/01/31' and '2010/12/31'
UNION ALL
SELECT *
FROM dbo.Table2 WITH (nolock)
WHERE AccrueThruDate between '2010/01/31' and '2010/12/31'
UNION ALL
SELECT *
FROM Db2.dbo.Table1 WITH (nolock)

At first sight, I see a few things I don’t like but didn’t immediately pick up on the table in the third select.  I saw the table name and moved on thinking it must be ok.  Then I decided to sanity check more precisely what was there and saw that it was pointing to the same table name in a different database.  Low and behold in the other database the table did not have the same number of columns.  These tables are all pretty wide at 100+ columns each.  The designer thought that every single column from each of these tables was needed and decided to take a shortcut.  Had the columns been explicitly named, I would have gotten a more useful error message, in this case, such as the following:

Msg 207, Level 16, State 1, Line 3
Invalid column name 'Column102'.

That little change in message could save time when troubleshooting a problem such as what I have described.  Granted, it does not tell me which database or table but at least I know which column is missing.  This is one little reason why I like to explicitly name the columns in my Select statements.

Another reason why I like to use the column names instead of a ” * ” is that I have seen queries improve dramatically in performance speed when compared.  In this little case, even though there are 100+ columns from each table to list out and all of the data would be returned from each of these, I see a 700 ms improvement in return time on just the “top 1000″ records.  When I bump that up to the full result set required by the view, I tire of waiting for the results from either query.  Both are terrificly slow (takes longer than 15 minutes to return the full result set).

But But But…
So why do people use Select * in their code?  It is a lot easier to type than having to type all of the column names.  Because it is fewer keystrokes it would also save time when writing code.  Many times, we may also use it when trying to get a glimpse of the system really quick.  It is also oftentimes used when doing a demo.

Should it be used?  There are cases for using that kind of code such as in an “if exists” block.  There are cases where the performance impact is minimal or just as good as listing the columns so it is fine.  I use it when I am trying to do a quick and dirty statement while troubleshooting.  I don’t like to put that into code to be consumed by applications or reused.  I like to know exactly what columns are being returned by my queries and I don’t feel it necessary to return more data than necessary (and consequently consume more resources).

What are some cases you have found a select * to be better than explicitly naming the columns?

Comments

Posted by Steve Jones on 17 August 2010

I was trying to figure out what was wrong in your code and missed the database thing as well. A nice little story to remind us to check the simple things.

Posted by Jason Brimhall on 18 August 2010

Thanks Steve.

Posted by tstaker on 18 August 2010

Nice blog post.

Typing out the field names instead of * is a pain and totally see why others do it.  I broke this habit many years ago and have been pleased with a trick I learned with SSMS.  If you expand out the object explorer in SSMS down to the columns folder you can click and drag the folder to the query window and all the columns will appear in your query window.

Posted by Jason Brimhall on 18 August 2010

Excellent trick - thanks.

Posted by Beth Ludwig on 23 August 2010

Love the new trick ! I'll be using it lots & will share with the rest of the dept.

Posted by jason on 24 August 2010

I never knew that about dragging the Columns folder - Nice tip!

Posted by Ron Ronhovde on 24 August 2010

I like the idea and it works on table names ALSO.  I am forever mistyping names.  Now I won't. :-)

Posted by bwillsie-842793 on 27 August 2010

I set the default "Value for Select Top <n> Rows command" to 1 in SMS, then find the whatever table I want to create a query for in the object explorer, right click on it, and choose the "Select Top 1 Rows" option.

This creates a generic query that includes all the fields, already formatted so it is easy on the eyes.

I then trim or comment out extraneous fields, and add any conditional code as required.  I'll cut and paste fields as appropriate to define the conditional code.

Posted by Yitzchok Lavi on 29 August 2010

Jason wrote: There are cases for using that kind of code such as in an "if exists" block.

But even there, you can avoid using "select *" and write "select 1" (or any other constant you like) instead.

Even if it makes no difference to the optimizer, I prefer to read that because it broadcasts that the only fields that matter are the ones in the "where" clause.

Leave a Comment

Please register or log in to leave a comment.