December 17, 2004 at 7:31 pm
Hi there,
I am using the SUM function to total values in a single column.
The SUM function ignores NULL values, and continues totalling the column. However I need to return a NULL value as the SUM total if the column contains any row NULL values.
Is there a way to do this?
Thanks,
Mark
December 18, 2004 at 10:40 pm
Is this what you mean?
1. Using a table like this:
---------------------------
CREATE TABLE [dbo].[ab_sum_nulls] (
[sum_iid] [int] IDENTITY (1, 1) NOT NULL ,
[item_grp] [char] (10) ,
[item_value] [int] NULL
) ON [PRIMARY]
GO
2. With data like this:
SELECT item_grp, item_value
FROM ab_sum_nulls
------------------
item_grp item_value
A NULL
A 10
A 12
B 0
B 10
B 12
3. The regular SQL sum looks like this:
SELECT item_grp, SUM(item_value) AS raw_sum,
SUM(CASE WHEN item_value IS NULL THEN 1
ELSE 0 END) AS null_ct
FROM ab_sum_nulls
GROUP BY item_grp
--------------------
item_grp raw_sum null_ct
A 22 1
B 22 0
4. Then this gives you what you want:
SELECT item_grp,
CASE WHEN null_ct = 0 THEN raw_sum
ELSE NULL END AS item_sum
FROM (
SELECT item_grp, SUM(item_value) AS raw_sum,
SUM(CASE WHEN item_value IS NULL THEN 1
ELSE 0 END) AS null_ct
FROM ab_sum_nulls
GROUP BY item_grp
) AS sum_qry
--------------
item_grp item_sum
A NULL
B 22
Bob Monahon
December 19, 2004 at 12:51 am
Hi Bob,
Great script! That is exactly what I needed.
I am actually completing the sum as part of a cross table or pivot selection. Using:
SELECT SUM(CASE item_grp WHEN 'A' THEN item_value ELSE 0 END) AS 'Total A',
SUM(CASE item_grp WHEN 'B' THEN item_value ELSE 0 END) AS 'Total B'
FROM ab_sum_nulls
I get:
Total A Total B
----------- -----------
22 22
(1 row(s) affected)
I have combined this with your script:
SELECT SUM(CASE WHEN null_ct = 0 AND item_grp = 'A' THEN raw_sum ELSE NULL END) AS 'Total A',
SUM(CASE WHEN null_ct = 0 AND item_grp = 'B' THEN raw_sum ELSE NULL END) AS 'Total B'
FROM (
SELECT item_grp, SUM(item_value) AS raw_sum, SUM(CASE WHEN item_value IS NULL THEN 1 ELSE 0 END) AS null_ct
FROM ab_sum_nulls
GROUP BY item_grp
  AS sum_qry
to get:
Total A Total B
----------- -----------
NULL 22
(1 row(s) affected)
Do you think there would be a large performance hit running this query on a table of about 1 million rows?
Thanks again for you help.
Mark
December 19, 2004 at 2:18 am
Here is another way, without using a subquery:
SELECT item_grp, CASE WHEN COUNT(item_value)<COUNT(*) THEN NULL ELSE SUM(item_value) END AS raw_sum FROM ab_sum_nulls GROUP BY item_grp
And here is another one, without using a CASE:
SELECT item_grp, NULLIF(1-SIGN(COUNT(*)-COUNT(item_value)),0) *SUM(item_value) AS raw_sum FROM ab_sum_nulls GROUP BY item_grp
(I have to agree, the last query is very cryptic... I wrote it just for fun, trying to get the shortest query that does the job)
All these queries cause a "Warning: Null value is eliminated by an aggregate or other SET operation." If this annoys you, you can use the following query
SELECT item_grp, CASE WHEN SUM(CASE WHEN item_value IS NULL THEN 1 ELSE 0 END)=0 THEN NULL ELSE SUM(ISNULL(item_value,0)) END AS raw_sum FROM ab_sum_nulls GROUP BY item_grp
Regarding the performance, all these queries should have a very similar execution plan: all of them require one table scan. If the table in question has 1 million rows... this would take some time, of course (1 to 3 seconds, maybe).
Razvan
December 19, 2004 at 11:04 am
Hello Mark,
Re: Do you think there would be a large performance hit running this query on a table of about 1 million rows?
Probably not. I ran the query with SHOWPLAN_ALL, and it appears to run a single pass through the data (your million rows), with possibly a second pass through an intermediate result set (the summary by [item_grp]).
I liked Razvan's suggestions; so I compared queries. Here's yours:
SET SHOWPLAN_ALL ON
GO
SELECT SUM(CASE WHEN null_ct = 0 AND item_grp = 'A'
THEN raw_sum ELSE NULL END) AS 'Total A',
SUM(CASE WHEN null_ct = 0 AND item_grp = 'B'
THEN raw_sum ELSE NULL END) AS 'Total B'
FROM (
SELECT item_grp, SUM(item_value) AS raw_sum
, SUM(CASE WHEN item_value IS NULL
THEN 1 ELSE 0 END) AS null_ct
FROM ab_sum_nulls
GROUP BY item_grp
) AS sum_qry
GO
And here's a version based on Razvan's ideas to eliminate the "SUM(CASE" in the inner query:
SET SHOWPLAN_ALL ON
GO
SELECT SUM(CASE WHEN item_ct = row_ct AND item_grp = 'A'
THEN raw_sum ELSE NULL END) AS 'Total A',
SUM(CASE WHEN item_ct = row_ct AND item_grp = 'B'
THEN raw_sum ELSE NULL END) AS 'Total B'
FROM (
SELECT item_grp, SUM(item_value) AS raw_sum
-- Next line causes Warning msg:
-- Null value is eliminated by an aggregate...
, count(item_value) AS item_ct
, count(*) as row_ct
FROM ab_sum_nulls
GROUP BY item_grp
) AS sum_qry
GO
The plans were almost identical, and estimaged CPU and IO were exactly the same. It appears that when you write "COUNT(item_value)", SQL Server actually executes something like this: SUM(CASE WHEN item_value IS NULL THEN 0 ELSE 1). Huh!
The query you're using actually looks to perform a bit better, because the avg. row size is smaller on some of the showplan steps; so for millions of rows you might see a benefit.
Bob Monahon
December 19, 2004 at 10:55 pm
Thanks for you help guys - brilliant!
Mark
Viewing 6 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy