Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

EXISTS Subqueries: SELECT 1 vs. SELECT * (Database Weekly 11.02.2008) Expand / Collapse
Author
Message
Posted Monday, February 11, 2008 12:59 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 6:23 AM
Points: 449, Visits: 1,856
Hello,

I just read through my latest email from SSC (Weekly Update 11.02.08) and saw an interesting subject on the if exists subquery.

The link in the mail pointed to a blog : http://www.sqlskills.com/blogs/conor/2008/02/07/EXISTSSubqueriesSELECT1VsSELECT.aspx

The blog says that it is better (quicker) to do a select 1 instead of select * (no lookup of the metadata with select 1).

However, I remember reading somewhere on these forums that the performance difference is so negligible that it can be ignored (topic was about select * or select [col1],[col1]...)

My question is: is select 1 better in the instance than select *?

I have seen both in use in my company and would like to move people to one version or the other (sort of common code practice).

I would appreciate your views on this.

Thanks in advance!

GermanDBA



Regards,

WilliamD
Post #453737
Posted Monday, February 11, 2008 2:24 PM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 15,444, Visits: 9,596
The advantage of "Select Col1, Col2...." (all columns explicitly named), vs "Select *", is more than just performance.

First, if you just need a few columns, not all of them, it means you pull less data. That means less I/O, less RAM, less CPU, etc. This won't matter much in a small and/or low-traffic database, or one where you have a high bandwidth connection. It will matter more if you are pulling large amounts of data across a lower bandwidth network.

For example, if you pull 2 Int columns, and 1 varchar(50) column, for 1-million records, you're looking at c. 58 Meg of data. If you don't need the varchar column, you instead pull 8 Meg of data. On a Gigabit connection on a server that's bored, you'll do just fine with the bigger pull, but across the Internet, you'll definitely notice it.

The main advantage, however, is in future development on the database. If columns are added to the table, or renamed, an implicit select can be broken.

For example, if you change the name of a column from "FName" to "FirstName", on a table, you can look in the system views for all procs, views, etc., that reference FName and update them accordingly, but it becomes much harder to identify procs, etc., that implicitly reference that column with a "*". Let's say you have five procs that use that column, and 175 that reference that table. If all 175 use "Select *", you have to go into each of them, figure out which ones actually do anything with that column, and then update them. If you use explicit selection, you go into the five that actually need updating, and fix those. The rest you can ignore. That means typing "Select *" (which is usually done to save a few seconds of typing column names), can cost you hours of work later on.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #454080
Posted Monday, February 11, 2008 2:28 PM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 15,444, Visits: 9,596
On the specific point of "if exists", the reason to use "*" in that case is so SQL server can pick which index it wants to use, instead of putting in a specific column name and thus possibly forcing it to use a specific index. It's a minor speed advantage.

Using "select 1" means SQL Server can use whatever index it wants, and doesn't have to look up column names. Again, a very minor improvement, but every bit counts in some busy servers.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #454082
Posted Monday, February 11, 2008 6:04 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Monday, October 25, 2010 6:09 AM
Points: 1,621, Visits: 409
Hi,
either you can use columns or * or 1 with EXISTS clause.
The EXISTS clause is based on TRUE or FALSE. it doesn't depend on columns. If you use column names or * then server has to identify what are all the columns that you want to retriev and where they are comming from and whther you have prvileges on those columns are not. so obviously it creates some bit of performance issues.
so go with 1 where ever you have EXISTS clause.
Thanks -- Vj
http://dotnetvj.blogspot.com


Thanks -- Vijaya Kadiyala
www.dotnetvj.com
SQL Server Articles For Beginers



Post #454166
Posted Tuesday, February 12, 2008 1:08 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 6:23 AM
Points: 449, Visits: 1,856
Hi,

The point about which index is used seems a bit superfluous to me because the remaining part of the select is what makes the server choose the index.

If I perform:
 select 1 from table where col1 = value 

and col1 has no index then there is only the benefit of not having to find the columns of "table" by using select 1, right?

I would only agree with the statement that select 1 would choose the best index, if col1 was in more than one index on table. The server would then look which index is best used for the "where" part of the statement and then use that.

So the performance winner with select 1 is that the server doesn't take the hit for column checks.

Am I barking up the wrong tree here or does it sound sane to you guys?

Thanks

GermanDBA



Regards,

WilliamD
Post #454248
Posted Tuesday, February 12, 2008 1:15 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Yesterday @ 4:54 PM
Points: 42,434, Visits: 35,488
GermanDBA (2/12/2008)

So the performance winner with select 1 is that the server doesn't take the hit for column checks.


That's what I understood from Conor's post. It's probably a very, very minimal hit. Just bear in mind that this is only for Exists subqueries.




Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #454250
Posted Tuesday, February 12, 2008 8:21 AM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Friday, June 20, 2014 2:22 PM
Points: 190, Visits: 436
In MSSQL Server Management Stdio with "Include Actual Execution Plan" turned on I ran this using a table with lots of rows and a WHERE clause with low specificity...

dbcc dropcleanbuffers;
set statistics io on
Select 1 where exists(select * from dbo.WordIndex where Term='inc')
set statistics io off
Go
dbcc dropcleanbuffers;
set statistics io on
select * from dbo.WordIndex where Term='inc'
set statistics io off

In the "Execution Plan" tab: Notice...
the rellative query costs of both batches
the estimated number of rows vs actual number of rows of the two seek constructs

In the "Messages" tab: Notice...
the difference in table IO stats between the two batches

Clearly SQL Server is optimizing the EXISTS subquery. Try this with different predicates to uncover more info...





PeteK
I have CDO. It's like OCD but all the letters are in alphabetical order... as they should be.
Post #454456
Posted Wednesday, February 13, 2008 8:08 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: 2 days ago @ 2:32 PM
Points: 1,182, Visits: 1,969
Better syntax:

Use NULL vs. a literal such as 1 or 'x' as in:

SELECT NULL FROM ...

Sample code using EXISTS that checks for the proper installation of the full-text filters for various document types:
-- Ensure that the IFilter interfaces (filters) are enabled for ALL of the document types supported by the product.
-- Abort if they are not.
IF NOT EXISTS
(
SELECT NULL
WHERE EXISTS (SELECT NULL FROM sys.fulltext_document_types WHERE document_type = N'.doc')
AND EXISTS (SELECT NULL FROM sys.fulltext_document_types WHERE document_type = N'.ppt')
AND EXISTS (SELECT NULL FROM sys.fulltext_document_types WHERE document_type = N'.pdf')
)
BEGIN;
RAISERROR('The required IFilters are not enabled or installed - aborting', 16, 1);
END;
GO




(PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.
Post #455097
Posted Wednesday, February 13, 2008 9:10 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Yesterday @ 4:54 PM
Points: 42,434, Visits: 35,488
There's a massive difference between the statements
Select 1 where exists(select * from dbo.WordIndex where Term='inc')
and
select * from dbo.WordIndex where Term='inc'

Exists is only checking for the existence of a row. It can use whatever indexes are appropriate based on the where clause and return as soon as a row is found. It doesn't need to retrieve any actual values from the table in question
The select * , however has to retrieve all the columns that match the where clause. It also, if it was using a NC index, has to lookup the rest of the columns in the table from the clustered index.

All Conor was talking about was the difference between
IF EXISTS (SELECT 1 FROM SomeTable WHERE SomeCondition)
and
IF EXISTS (SELECT * FROM SomeTable WHERE SomeCondition)



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #455154
Posted Wednesday, February 13, 2008 9:36 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: 2 days ago @ 2:32 PM
Points: 1,182, Visits: 1,969
All Conor was talking about was the difference between
IF EXISTS (SELECT 1 FROM SomeTable WHERE SomeCondition)
and
IF EXISTS (SELECT * FROM SomeTable WHERE SomeCondition)


True. The use of *, as noted, is not good for the existance test subquery. Often, you see SELECT 1 as noted, or SELECT 'x', or something equiv. I'm just stating that you can use NULL as in:
IF EXISTS (SELECT NULL FROM SomeTable WHERE SomeCondition)




(PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.
Post #455175
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse