SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


The SQL * Wildcard


The SQL * Wildcard

Author
Message
namtaru
namtaru
Forum Newbie
Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)

Group: General Forum Members
Points: 6 Visits: 14
So I was on a interview at Microsoft and they threw me up in front of a whiteboard. They described some tables they wanted me to illustrate and then Plain english asked me for data results, for which I wrote the queries to get the results they were after. It all went well and yay I got the job.

BUT
I wrote out one Query with a Select * to which the manager asked me, You know that is a inefficient way to call out the data from that table, a much better way is to list all the columns you wish to show, even if it is all of them. Do you know why?

My answer was " Well from a writing standpoint this way is faster, But i imagine by doing it with a * I am forcing a scan on the tables and not using indexes that could cover some of the Retrieval. He said I was partially right and that I should take this as homework. Well ive searched everywhere for the answer and either everyone is woefully as misinformed as me, or they all know it as common knowledge and I just dont. Anyone have a DEFINITAVE answer to this question?


Thanks a bunch in anticipation to your answers!
jsheldon
jsheldon
Ten Centuries
Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)

Group: General Forum Members
Points: 1392 Visits: 425
I would be curious to the answer to this as well. I do know as knowledge that using Select * on large databases you are going to take a hit on performance.

I tried a test and did it on a small table with seven columns (i disagree that you would list all columns on a table, I wouldn't use a select * but select only the columns I needed)

I did a select * and a select list all columns

This table has a clustered index so when I first ran select * it built the clustered index scan, when I did the select list columns, the results were in milliseconds but the cost from the execution plan was the same Clustered index scan.

I guess the answer is 'depends on what type of indexes are applied to the table/views'????
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)

Group: General Forum Members
Points: 214132 Visits: 41979
The best answer would be "to protect the results". I've seen it happen before... DBA's can decide there's a problem with a table or they need to instantiate some security measures (hide a column, whatever). So, they rename the table to something else, and create a view with the old table name... in their efforts, they change the order of the columns. Either the code goes BOOM or, worse yet, they put in an extra column of the same datatype as, say, your column "2" and suddenly you're selecting the wrong data. They may even change the order using the "surrogate view" just because some hot-dog thinks it should be in a different order.

Being a bit of a "SysOp from hell", it might be fun to do that once a week or so just to see whose code breaks Tongue

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
namtaru
namtaru
Forum Newbie
Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)

Group: General Forum Members
Points: 6 Visits: 14
I am wondering the more I think of it,that perhaps this is what he was trying to move me towards, Lets say:

Select *
From dbo.customers

Would that plan be cached? and be reusable? I am not sure it would be, where as if you selected all the columns. with a

Select c.fname,c.lname,c.contactnumber
From dbo.customers c

Would be cached, that sound right to you guys?
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)

Group: General Forum Members
Points: 214132 Visits: 41979
Yep... maybe that too. Same thing goes with statistics, I suppose.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Steve Jones
Steve Jones
SSC Guru
SSC Guru (145K reputation)SSC Guru (145K reputation)SSC Guru (145K reputation)SSC Guru (145K reputation)SSC Guru (145K reputation)SSC Guru (145K reputation)SSC Guru (145K reputation)SSC Guru (145K reputation)

Group: Administrators
Points: 145849 Visits: 19425
While I don't like SELECT * in code, I'm not sure it's less efficient. I'd like to know why. I always learned that it resolved the column list at compile time, which is why new columns don't show up in a view if you've used SELECT *.

I suppose that it might be slightly less efficient because of that, but that has to be noise compared to the query time. If this is between SELECT * and SELECT (all columns), I would like to know why it's less efficient.

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
GilaMonster
GilaMonster
SSC Guru
SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)

Group: General Forum Members
Points: 223338 Visits: 46297
namtaru (12/26/2007)

Select *
From dbo.customers

Would that plan be cached? and be reusable?


Yes and yes. It's a simple plan and there;'s no reason for it not to be cached and reused. You can check by looking in the plan cache


SELECT creation_time, last_execution_time, execution_count, dbid, objectid, text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st



The point with SELECT * is not that it's always less efficient, it's that it could result in more data been returned than necessary. It may be less efficient, depending what you're doing and what's in the table

Consider a table with 6 columns.
CREATE TABLE Test (
a int,
b varchar(10),
c datetime,
d datetime,
e varchar(500),
f smallint)

If you only want columns b and c and your use select *, you're returning, across a network, 4 columns that you don't need (between 14 and 514 bytes per row)

If there's an index on B include a and c, and you write a query
SELECT a,b,c from Test where B='Rob'. The index is covering for that query and the query can be satisfied with an index seek. No lookup to the cluster/heap is required.

Now consider SELECT * from Test where B='Rob'
An index seek can still be done, but now a lookup is required to the cluster/heap to retrieve columns d,e and f. More IOs, slower query.

Select * makes creating covering indexes near impossible to create (as all the columns in the table are required by the query)

Now, consider a modification is made to that table and column g is added, a varbinary(max) that holds an image averagng 700kb per row.
Now any query that does a SELECT * will start returning that image, even if it's not required. Since it's a LOB and stored out of row that means more IOs (lots more IOs), less efficient use of memory and a lot more data going over the network.

Worse, any query that has assumed that there are 6 columns in the table
( Like
insert into Table2 (Col1, col2, col3, Col4, Col5, Col6)
Select * From Test
)
will break when run.

Does that help?

jsheldon: The reason your second select ran faster than your first had nothing to do with the structure of the query. The first had to fetch the data from disk into cache, the second read from cache. If you're doing time tests like that, either ensure that the cache has the data before you start timing, or run DBCC DropCleanBuffers between runs to ensure an empty data cache

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)

Group: General Forum Members
Points: 214132 Visits: 41979
Understood... but the interviewer from Microsoft asked why it's better to list all of the columns than to use Select *...

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
GilaMonster
GilaMonster
SSC Guru
SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)SSC Guru (223K reputation)

Group: General Forum Members
Points: 223338 Visits: 46297
Other than protecting against future changes, I can't think of a good reason.

Possibly SELECT * requires an extra lookup to the system tables to get the column names. Kinda like the sp_ double lookup.

Would be interesting to check, but I'm not quite sure how. Maybe profiler and the object Access events

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)

Group: General Forum Members
Points: 214132 Visits: 41979
Now that would be a good reason... can't see it making much of a difference in non-looping batch code, but I can see where it could make a heck of a difference in a high hit volume GUI or RBAR code.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search