March 3, 2006 at 9:16 am
Hello Mike,
this is a very good article, everything correctly describes issues I have to deal with almost every day when working with reports.
It would be a good idea if you would like to write a part 2 dealing with joins and nulls. The example may be when you have customers and orders and you would like to return all customers (or a count of customers) who did not place the orders. Sometimes these queries could be complex, especially when the specific order is required.
Regards,Yelena Varsha
March 3, 2006 at 10:16 am
Hi Steve and Yelena,
Thanks for the feedback! I'll look into writing an article on JOINs in the future. I'm actually working on a couple of other articles now, but I'll draft up a couple of ideas for later. My feeling is that you can't over-explain JOINs. Especially now that SQL 2005 kicks it up a notch with CTE's.
Thanks again
March 6, 2006 at 7:10 am
To expand a bit on the count(*) versus count(1) discussion, what does count(1) mean?
The fact that count(1) returns the same value as count(*) is an artifact of the parser and the query optimizer. In fact, count(0), count(2), and count(20) also return the same value on both of the SQL implementations I've tested this morning.
I suspect that the suggestion that count(1) is faster than count(*) came from a poorly constructed performance test on some version of some SQL implementation.
Keith W. Hare
March 6, 2006 at 7:34 am
The point I wanted to make was that "select count(x) from table" gives you the number of non-null values returned by the statement "select x from table". This applies when x is a column and also when x is something else (except *). So count(*) isn't so special after all, you may think of it as nothing more than count(1) (or count(0) if you prefer). I certainly don't claim it is faster...
I acknowledge that count(*) is part of the ansi syntax, as Mike also points out. I don't want to throw away count(*), only point out that SQL (on SQL Server 2000) is equally powerful without it.
March 7, 2006 at 7:47 am
The point I'm making is that COUNT(1) is not defined as standard syntax, COUNT(*) is. COUNT(1) [or COUNT(100), etc.] doesn't have a standard behavior defined, so that its behavior could change or even be removed in future versions of SQL Server... That being the case, I don't see COUNT(constant) as being powerful... I think it's probably more along the lines of a proprietary undocumented feature that should probably be avoided in favor of well-documented/standard features and syntax. I think "more powerful" implies that COUNT(constant) achieves something above and beyond what can be done with the standard ANSI COUNT(*) syntax... I don't see that as the case though...
March 7, 2006 at 1:07 pm
As I tried to say in my previous post, I think I understand your point.... I only wanted to say that on SQL Server 2000 you don't lose anything (except perhaps performance) by not using count(*) (the quote "more powerful" must be from somewhere else ). I think expressions such as count(1) or count('') or even count(case when column = 1 then null else 1 end) are pretty straightforward (but not necessarily very useful) generalizations of the count(column) syntax, and I would be very surprised if this behaviour changes in future versions of SQL Server (no guarantees, however
). Personally I stick with count(*) in queries but sometimes think of it as count(1). This could very well be because I have never read the ansi standard and have never worked with other database systems...
March 8, 2006 at 6:06 pm
Oops you're right, you said SQL was "just as powerful" without COUNT(*), not "more powerful". Sorry about that. Because COUNT(constant) doesn't have a guaranteed behavior, it's possible that ANSI could define a specific behavior for it in the future that differs from the MS implementation; or they could define that it should throw an exception, or any other number of strange things that could break code down the road.
On another topic, one of the items I should have mentioned in the article is COUNT(DISTINCT column), which is also defined by ANSI. This variation returns the count of DISTINCT rows for a particular column. Like COUNT(column), this one eliminates NULLs before returning the result. I think this variation tends to get less use than it's cousins COUNT(*) and COUNT(column), but it can be useful on occasion.
Thanks again.
September 6, 2006 at 12:01 pm
I ran into a snag on the SUM() section, especially the statement "we can rest assured that NULLs are eliminated from our result"
I tried to use SUM() as suggested but it returns NULLS.
I then tried to use a CASE statement like this (but it failed to eliminate NULLS as well):
SELECT SUM(CASE WHEN someField IS NULL THEN 0) AS SumOfSomeField FROM someTable
To get it to work I needed to change the CASE logic like this:
SELECT SUM(CASE WHEN someField IS NOT NULL THEN someField ELSE 0) AS SumOfSomeField FROM someTable
I can't understand why T-SQL behaves this way. But then again it is a Microsoft product so I'm not surprised
September 6, 2006 at 2:07 pm
The only time (that I know of) that SQL returns NULLs with SUM() is if there are no rows to SUM. This is the case if: 1) There are no rows in your table, and 2) If all the rows you are SUMming are NULL [in which case they are eliminated, and it's the same thing as SUMming no rows].
The SUM of no rows is not a known value. AFAIK that's standard ANSI procedure. The only other thing I can think of without seeing your table and your test data, is that you might have set some strange non-ANSI NULL feature.)
BTW, how does this work out for you:
SELECT SUM(COALESCE(somefield, 0)) FROM sometable
September 24, 2006 at 8:39 pm
Hi Mike ,
happen to stumble across these articles of yours..they are really informative & well explained.
thanks for making the technology a better tool to work with...
Priyaranjan Mohanty
September 26, 2006 at 7:10 pm
Thank you Priyaranjan, I appreciate the feedback!
March 27, 2008 at 11:39 pm
greate one , very useful for aggregate function,
thanks
April 4, 2008 at 4:33 am
Very good! Thank you.
I was waiting for someone to address the fact that nulls are grouped together by a group by clause -- a special case exception to the "null is never equal null" rule? (Not equal, just the same as...)
April 4, 2008 at 12:10 pm
April 5, 2008 at 4:32 pm
Mike,
I've started reading your article and am not clear why the query: SELECT COUNT(region) FROM Suppliers WHERE Region IS NOT NULL returns 0
1. WHERE clause eliminates every row in which region is not NULL.
Therefore, I only have 9 rows where Region IS Not Null
3. The COUNT(region) function eliminates every row in which region is NULL.
I would still expect 9 rows still to be returned back after the select query is parsed.
If you could please elaborate on why 0 rows are returned back from the query.
Viewing 15 posts - 16 through 30 (of 31 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy