Aggregate Function Product()

pmcpherson

SSC-Addicted

Points: 477
February 5, 2012 at 10:02 pm

#252486

Comments posted to this topic are about the item Aggregate Function Product()

SwePeso SSC-Dedicated Points: 39758 · Answer 1

I hate to be the party pooper but change one value to a negative value and try again.

Also see http://weblogs.sqlteam.com/peterl/archive/2008/11/19/How-to-get-the-productsum-from-a-table.aspx

N 56°04'39.16"
E 12°55'05.25"

Mike McIver SSC-Addicted Points: 493 · Answer 2

Maybe this to account for signs . . .

EXP(SUM(LOG(ABS([RowValue])))) * SIGN(CHECKSUM_AGG([RowValue]))

Mike
MacroTrenz.Com
LinkedIn

Hugo Kornelis SSC Guru Points: 64790 · Answer 3

The method in the article is only good for positive (>0) values. Here is a formula that works for zeroes and negatives as well:

COALESCE(EXP(SUM(LOG(ABS(NULLIF(val, 0))))),0) * (1 - 2 * (COUNT(CASE WHEN val < 0 THEN 1 END) % 2))

The first part uses NULLIF to replace 0 with NULL, ABS to change negative values to positive ones, then pours that into the log-based product formula (except that this one uses natural logarithms instead of 10-based logarithms). Because of NULL propagation, a single 0 in the input set (which is converted to NULL by NULLIF) will cause the result to be NULL; the COALESCE function fixes this by replacing NULL with 0. The result of this first part is the product of the absolute value of all numbers.

The second part of the formula counts the number of negative values in the input set, then calculates the remainder after division by 2 - resulting in 0 for an even number of negatives, and 1 for an odd number. This value is multiplied by 2 and subtracted from 1, resulting in 1 if there's an even number of negatives, and -1 if there's an odd number. Multiply this by the product of the absolute values to get the final answer.

I first saw this technique in a book by Itzik Ben-Gan (who else?). I now had to google for it, and found my first hit in a comment by Rob Farley to this blog post by Michael Coles - though I did change and simplify Rob's version a bit - he had omitted the COALESCE and instead used a count of the number of zero values - a technique that I don't understand and that, frankly, probably doesn't work.

One problem with this approach is that there is no way to distinguish a NULL that was introduced by the NULLIF (and hence was a 0 at first) from a NULL that was already NULL in the input set, so the result will be 0 if one of the input values is NULL, whereas the proper result would be either NULL (if you want to include the NULL input in the calculation), or the product of the non-NULL values (if you want the aggregate to work like all other aggregates do). I guess the best way to fix this would be to exclude NULL values in the WHERE clause, or to add some extra CASE expressions.
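To try the compact formula on concrete data, here is a minimal T-SQL sketch. The table variable @t, its columns grp and val, and the sample rows are made up for illustration and appear nowhere in this thread. It runs the formula exactly as posted next to a variant that tests for zeros explicitly with MIN(ABS(val)), the check used in the stepwise version quoted later in this post. The second column is there because, as far as I know, SUM skips NULL inputs rather than propagating them, so it is worth checking on your own server what a 0 (turned into NULL by NULLIF) and a genuine NULL really do to the result.

```sql
-- Hypothetical sample data; @t, grp and val are illustrative names only.
DECLARE @t TABLE (grp int NOT NULL, val float NULL);

INSERT INTO @t (grp, val)
VALUES (1, 2), (1, -3), (1, 4),      -- one negative, no zero
       (2, 5), (2, 0), (2, -1),      -- contains a zero
       (3, -2), (3, -2),             -- even number of negatives
       (4, 2), (4, NULL), (4, 3);    -- contains a genuine NULL

SELECT grp,
       -- Formula exactly as posted above.
       COALESCE(EXP(SUM(LOG(ABS(NULLIF(val, 0))))), 0)
         * (1 - 2 * (COUNT(CASE WHEN val < 0 THEN 1 END) % 2)) AS product_as_posted,
       -- Variant (an assumption, not from the post): test for a zero explicitly.
       CASE
           WHEN MIN(ABS(val)) = 0 THEN 0
           ELSE EXP(SUM(LOG(ABS(NULLIF(val, 0)))))
                  * (1 - 2 * (COUNT(CASE WHEN val < 0 THEN 1 END) % 2))
       END AS product_zero_checked
FROM @t
GROUP BY grp
ORDER BY grp;
```

If SUM does skip NULLs, group 2 (5, 0, -1) should come back as roughly -5 from the posted formula and 0 from the zero-checked variant, and group 4 (2, NULL, 3) as roughly 6 from both. The results are floats, so expect small rounding differences rather than exact integers.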

I also found a more readable version that uses the same technique, but broken down into steps, in a reply to this post on stackoverflow. This reply was posted by gbn.

SELECT
    GrpID,
    CASE
        WHEN MinVal = 0 THEN 0
        WHEN Neg % 2 = 1 THEN -1 * EXP(ABSMult)
        ELSE EXP(ABSMult)
    END
FROM
    (
    SELECT
        GrpID,
        --log of +ve row values
        SUM(LOG(ABS(NULLIF(Value, 0)))) AS ABSMult,
        --count of -ve values. Even = +ve result.
        SUM(SIGN(CASE WHEN Value < 0 THEN 1 ELSE 0 END)) AS Neg,
        --anything * zero = zero
        MIN(ABS(Value)) AS MinVal
    FROM
        Mytable
    GROUP BY
        GrpID
    ) foo

This version is more readable, but has the same problem with NULL values as the compact version above.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Hugo Kornelis SSC Guru Points: 64790 · Answer 4

I just saw the answer by Mike. I like the more compact way to get the multiplication factor (CHECKSUM_AGG), but I'm also a bit concerned - I could not find any documentation (at least not in the 30 seconds I spent searching -I know, pathetic!-) that this aggregate guarantees a negative result if the number of negative input values is odd, and a positive result otherwise. Without such documentation, this method appears to be less safe.

And a final comment regarding the original article that I forgot in my previous post:

"Notice the decimal point after the first argument of the power function. You need it to force a precession of 18 so that 4 * 2 equal 8 instead of 7."

The decimal point does not force a precision of 18, it forces the use of non-integer (numeric(5,2) to be precise). Without this, all numbers after the decimal place are cut off. That's why most versions of this algorithm on the internet use natural logarithm rather than 10-based logarithm - that one always uses non-integer calculation.
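The effect of the decimal point can be inspected directly. The sketch below is only a way to look, not a statement of what you will see: the variable @x and the label column are hypothetical, and it uses SQL_VARIANT_PROPERTY to report the type that POWER returns for an integer base, a base written with a trailing decimal point, and a float base.

```sql
-- @x is a hypothetical variable holding log10(4) + log10(2), i.e. log10 of 4 * 2.
DECLARE @x float = LOG10(4) + LOG10(2);

SELECT 'POWER(10, x)' AS expr,
       CAST(POWER(10, @x) AS sql_variant)                AS result,
       SQL_VARIANT_PROPERTY(POWER(10, @x), 'BaseType')   AS base_type,
       SQL_VARIANT_PROPERTY(POWER(10, @x), 'Precision')  AS type_precision,
       SQL_VARIANT_PROPERTY(POWER(10, @x), 'Scale')      AS type_scale
UNION ALL
SELECT 'POWER(10., x)',
       CAST(POWER(10., @x) AS sql_variant),
       SQL_VARIANT_PROPERTY(POWER(10., @x), 'BaseType'),
       SQL_VARIANT_PROPERTY(POWER(10., @x), 'Precision'),
       SQL_VARIANT_PROPERTY(POWER(10., @x), 'Scale')
UNION ALL
SELECT 'POWER(10e0, x)',
       CAST(POWER(10e0, @x) AS sql_variant),
       SQL_VARIANT_PROPERTY(POWER(10e0, @x), 'BaseType'),
       SQL_VARIANT_PROPERTY(POWER(10e0, @x), 'Precision'),
       SQL_VARIANT_PROPERTY(POWER(10e0, @x), 'Scale');
```

Comparing the three rows shows which data type each form of the base produces and whether the fractional part of the result survives, which is the behaviour the quoted sentence from the article is trying to explain.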

Paul White SSC Guru Points: 150468 · Answer 5

We use a CLR aggregate for this.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

archie flockhart SSCrazy Points: 2339 · Answer 6

I'd be interested to know what problems require the calculation of a product from each row in a query in this way.

Presumably also, you'd need to be aware of the number of values and size of values that are expected - it wouldn't take very many numbers multiplied together for the product to get very large. (Or very small, if the numbers being multiplied are between -1 and 1 )

David.Poole SSC Guru Points: 76355 · Answer 7

archie flockhart (2/6/2012)
I'd be interested to know what problems require the calculation of a product from each row in a query in this way.

Calculating correlation statistics

archie flockhart SSCrazy Points: 2339 · Answer 8

David.Poole (2/6/2012)
archie flockhart (2/6/2012)
I'd be interested to know what problems require the calculation of a product from each row in a query in this way.
Calculating correlation statistics

It's been a long time since I did any correlations but I can't remember needing to get a product of all the numbers in a dataset, and a quick google for the formulas didn't throw up anything that required Product( X₁ .. X_n)

Can you point me at more details ?

Hugo Kornelis SSC Guru Points: 64790 · Answer 9

archie flockhart (2/6/2012)
Presumably also, you'd need to be aware of the number of values and size of values that are expected - it wouldn't take very many numbers multiplied together for the product to get very large. (Or very small, if the numbers being multiplied are between -1 and 1 )

If I ever need to do this for real, I'll almost certainly add a CAST to float, to make sure that results of virtually all sizes can be represented.

Stephen.Rice Grasshopper Points: 24 · Answer 10

And here is another way to deal with 0 and negatives, if you don't want to use a CASE expression:

MIN(ABS(SIGN(num))) * (1-(SUM(SIGN(num) * (SIGN(num) -1))%4)) * POWER(10.,SUM(LOG10(ABS(Num) + 1 - ABS(SIGN(num)))))

Hugo Kornelis SSC Guru Points: 64790 · Answer 11

SQL Kiwi (2/6/2012)
We use a CLR aggregate for this.

That's definitely cleaner, and easier to understand code. But how does the performance of a CLR aggregate compare to performance of a complicated formula using native functions only? Have you ever done any perf testing?

Hugo Kornelis SSC Guru Points: 64790 · Answer 12

Stephen.Rice (2/6/2012)
And here is another way to deal with 0 and negatives, if you don't want to use a CASE expression:
MIN(ABS(SIGN(num))) * (1-(SUM(SIGN(num) * (SIGN(num) -1))%4)) * POWER(10.,SUM(LOG10(ABS(Num) + 1 - ABS(SIGN(num)))))

It's a real challenge to decipher what it does and how it works (I would never allow this in production code without extensive comments!) - but I like how it manages to avoid all explicit CASE expressions, and all implicit CASE expressions (NULLIF, COALESCE) as well. I'm not sure if it's true, but I suspect that CASE can be a tad slower than plain arithmetical functions.

Stephen.Rice Grasshopper Points: 24 · Answer 13

Sorry! I definitely should have added some comments.

The first section deals with zeros. ABS(SIGN()) is 1 for negative or positive numbers and 0 for zero, so the MIN will be 0 if any zeros exist in the input set.

The next section deals with negatives. SIGN(Num) * (SIGN(Num) - 1) yields 0 for positive or zero numbers and 2 for negative numbers. So we add them up and take the result mod 4, which gives 2 when there is an odd number of negatives and 0 when there is an even number.

An even number of negatives leaves the product positive, so we need to leave it alone. An odd number turns the product negative. So by taking 1 - (the above) we get 1 when all numbers are positive or the number of negatives is even, and -1 when the number of negatives is odd.

Finally we just need to prevent the LOG function from throwing an error if we pass in a zero. We can do that by adding 1 and taking off the ABS(SIGN()) of the number. This does not change the result for any non zero number but alters zeros to ones (which prevents an error being thrown).

As any product with a zero will yield zero (and we deal with that case in the first clause) it doesn't matter that we 'muck up' with the log here.
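The three parts described above can be looked at one by one. This is a small sketch on made-up data: the table variable @t, the column grp and the output column names are hypothetical, and num is declared as int on the assumption that integer input is intended (the % operator needs a non-float operand).

```sql
-- Hypothetical sample data; @t and grp are illustrative names only.
DECLARE @t TABLE (grp int NOT NULL, num int NOT NULL);

INSERT INTO @t (grp, num)
VALUES (1, 2), (1, -3), (1, 4),   -- one negative, no zero
       (2, 5), (2, 0), (2, -1);   -- contains a zero

SELECT grp,
       -- 0 if any row is zero, otherwise 1
       MIN(ABS(SIGN(num))) AS zero_factor,
       -- each negative adds 2; mod 4 leaves 2 for an odd count, 0 for an even count
       1 - (SUM(SIGN(num) * (SIGN(num) - 1)) % 4) AS sign_factor,
       -- product of absolute values, with zeros swapped for ones before LOG10
       POWER(10., SUM(LOG10(ABS(num) + 1 - ABS(SIGN(num))))) AS abs_product,
       -- the complete formula from the earlier post
       MIN(ABS(SIGN(num)))
         * (1 - (SUM(SIGN(num) * (SIGN(num) - 1)) % 4))
         * POWER(10., SUM(LOG10(ABS(num) + 1 - ABS(SIGN(num))))) AS product
FROM @t
GROUP BY grp
ORDER BY grp;
```

For group 1 the factors should be 1 and -1 with an absolute product of about 24, giving about -24. For group 2 the zero factor is 0, so the final product is 0 even though the absolute product of the remaining values is about 5.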

Stephen

cdac.nilesh Newbie Points: 1 · Answer 14

February 6, 2012 at 5:14 am

#1443025

It is useful..!!