February 5, 2012 at 10:02 pm
Comments posted to this topic are about the item Aggregate Function Product()
February 5, 2012 at 10:05 pm
I hate to be the party pooper but change one value to a negative value and try again.
Also see http://weblogs.sqlteam.com/peterl/archive/2008/11/19/How-to-get-the-productsum-from-a-table.aspx
February 6, 2012 at 1:01 am
Maybe this to account for signs . . .
EXP(SUM(LOG(ABS([RowValue])))) * SIGN(CHECKSUM_AGG([RowValue]))
Mike
MacroTrenz.Com
LinkedIn
February 6, 2012 at 1:02 am
The method in the article is only good for positive (>0) values. Here is a formula that works for zeroes and negatives as well:
COALESCE(EXP(SUM(LOG(ABS(NULLIF(val, 0))))),0) * (1 - 2 * (COUNT(CASE WHEN val < 0 THEN 1 END) % 2))
The first part uses NULLIF to replace 0 with NULL, ABS to change negative values to positive ones, then pours that into the log-based product formula (except that this one uses natural logarithms instead of 10-based logarithms). Because of NULL propagation, a single 0 in the input set (which is converted to NULL by NULLIF) will cause the result to be NULL; the COALESCE function fixes this by replacing NULL with 0. The result of this first part is the product of the absolute value of all numbers.
The second part of the formula counts the number of negative values in the input set, then calculates the remainder after division by 2 - resulting in 0 for an even number of negatives, and 1 for an odd number. This value is multiplied by 2 and subtracted from 1, resulting in 1 if there's an even number of negatives, and -1 if there's an odd number. Multiply this by the product of the absolute values to get the final answer.
I first saw this technique in a book by Itzik Ben-Gan (who else?). I now had to google for it, and found my first hit in a comment by Rob Farley to this blog post by Michael Coles - though I did change and simplify Rob's version a bit - he had omitted the COALESCE and instead used a count of the number of zero values - a technique that I don't understand and that, frankly, probably doesn't work.
One problem with this approach is that there is no way to distinguish a NULL that was introduced by the ISNULL (and hence was a 0 at first) from a NULL that was already NULL in the input set, so the result will be 0 if one of the input values is NULL, whereas the proper result would be either NULL (if you want to include the NULL input in the calculation), or the product of the non-NULL values (if you want the aggregate to work like all other aggregates do). I guess the best way to fix this would be to exclude NULL values in the WHERE clause, or to add some extra CASE expressions.
I als found a more readable version that uses the same technique, but broken down into steps, in a reply to this post on stackoverflow. This reply was posted by gbn.
SELECT
GrpID,
CASE
WHEN MinVal = 0 THEN 0
WHEN Neg % 2 = 1 THEN -1 * EXP(ABSMult)
ELSE EXP(ABSMult)
END
FROM
(
SELECT
GrpID,
--log of +ve row values
SUM(LOG(ABS(NULLIF(Value, 0)))) AS ABSMult,
--count of -ve values. Even = +ve result.
SUM(SIGN(CASE WHEN Value < 0 THEN 1 ELSE 0 END)) AS Neg,
--anything * zero = zero
MIN(ABS(Value)) AS MinVal
FROM
Mytable
GROUP BY
GrpID
) foo
This version is more readable, but has the same problem with NULL versions as the compact version above.
February 6, 2012 at 1:08 am
I just saw the answer by Mike. I like the more compact way to get the multiplication factor (CHECKSUM_AGG), but I'm also a bit concerned - I could not find any documentation (at least not in the 30 seconds I spent searching -I know, pathetic!-) that this aggregate guarantees a negative result if the number of negative input values is odd, and a positive result otherwise. Without such documentation, this method appears to be less safe.
And a final comment regarding the original article that I forgot in my previous post:
"Notice the decimal point after the first argument of the power function. You need it to force a precession of 18 so that 4 * 2 equal 8 instead of 7."
The decimal point does not force a precision of 18, it forces the use of non-integer (numeric(5,2) to be precise). Without this, all numbers after the decimal place are cut off. That's why most versions of this algorithm on the internet use natural logarithm rather than 10-based logarithm - that one always uses non-integer calculation.
February 6, 2012 at 1:59 am
We use a CLR aggregate for this.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 6, 2012 at 2:20 am
I'd be interested to know what problems require the calculation of a product from each row in a query in this way.
Presumably also, you'd need to be aware of the number of values and size of values that are expected - it wouldn't take very many numbers multiplied together for the product to get very large. (Or very small, if the numbers being multiplied are between -1 and 1 )
February 6, 2012 at 2:55 am
archie flockhart (2/6/2012)
I'd be interested to know what problems require the calculation of a product from each row in a query in this way.
Calculating correlation statistics
February 6, 2012 at 3:22 am
David.Poole (2/6/2012)
archie flockhart (2/6/2012)
I'd be interested to know what problems require the calculation of a product from each row in a query in this way.Calculating correlation statistics
It's been a long time since I did any correlations but I can't remember needing to get a product of all the numbers in a dataset, and a quick google for the formulas didn't throw up anything that required Product( X_{1} .. X_{n})
Can you point me at more details ?
February 6, 2012 at 4:02 am
archie flockhart (2/6/2012)
Presumably also, you'd need to be aware of the number of values and size of values that are expected - it wouldn't take very many numbers multiplied together for the product to get very large. (Or very small, if the numbers being multiplied are between -1 and 1 )
If I ever need to do this for real, I'll amost certainly add a CAST to float, to make sure that results of virtually all sizes can be represented.
February 6, 2012 at 4:04 am
And another way to deal with 0 and negatives it if you don't want to use a 'case' statement :
MIN(ABS(SIGN(num))) * (1-(SUM(SIGN(num) * (SIGN(num) -1))%4)) * POWER(10.,SUM(LOG10(ABS(Num) + 1 - ABS(SIGN(num)))))
February 6, 2012 at 4:05 am
SQL Kiwi (2/6/2012)
We use a CLR aggregate for this.
That's definitely cleaner, and easier to understand code. But how does the performance of a CLR aggregate compare to performance of a complicated formula using native functions only? Have you ever done any perf testing?
February 6, 2012 at 4:13 am
Stephen.Rice (2/6/2012)
And another way to deal with 0 and negatives it if you don't want to use a 'case' statement :MIN(ABS(SIGN(num))) * (1-(SUM(SIGN(num) * (SIGN(num) -1))%4)) * POWER(10.,SUM(LOG10(ABS(Num) + 1 - ABS(SIGN(num)))))
It's a real challenge to decipher what it does and how it works (I would never allow this in production code without extensive comments!) - but I like how it manages to avoid all explicit CASE expressions, and all implicit CASE expressions (NULLIF, COALESCE) as well. I'm not sure if it's true, but I suspect that CASE can be a tad slower than plain arithmetical functions.
February 6, 2012 at 4:28 am
Sorry! I definitely should have added some comments.
The first section deals with zeros. ABS(SIGN()) for minus numbers or positive numbers being 1 and zero being 0 so the MIN will be 0 if any zeros exist in the input set.
The next section deals with negatives. SIGN(Num) * (SIGN(NUM)-1) yields 0 for positive or zero numbers and 2 for negative numbers. So we add them up and take a mod 4 to give 2 when there are an odd number of negatives or 0 where there are an even number.
An even number in the aggregate leaves an even result so we need to leave alone. An odd number turns the product negative. So by taking 1- (the above) we get 1 for all positive numbers or even numbers of negatives or -1 for odd numbers of negatives.
Finally we just need to prevent the LOG function from throwing an error if we pass in a zero. We can do that by adding 1 and taking off the ABS(SIGN()) of the number. This does not change the result for any non zero number but alters zeros to ones (which prevents an error being thrown).
As any product with a zero will yield zero (and we deal with that case in the first clause) it doesn't matter that we 'muck up' with the log here.
Stephen
February 6, 2012 at 5:14 am
It is useful..!!
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply