Aggregate Function Product()

  • pmcpherson

    SSC-Addicted

    Points: 453

    Comments posted to this topic are about the item Aggregate Function Product()

  • SwePeso

    SSC-Dedicated

    Points: 39693

    I hate to be the party pooper but change one value to a negative value and try again.

    Also see http://weblogs.sqlteam.com/peterl/archive/2008/11/19/How-to-get-the-productsum-from-a-table.aspx


    N 56°04'39.16"
    E 12°55'05.25"

  • Mike McIver

    SSC-Addicted

    Points: 493

    Maybe this to account for signs . . .

    EXP(SUM(LOG(ABS([RowValue])))) * SIGN(CHECKSUM_AGG([RowValue]))

  • Hugo Kornelis

    SSC Guru

    Points: 64685

    The method in the article is only good for positive (>0) values. Here is a formula that works for zeroes and negatives as well:

    COALESCE(EXP(SUM(LOG(ABS(NULLIF(val, 0))))), 0) * MIN(ABS(SIGN(val))) * (1 - 2 * (COUNT(CASE WHEN val < 0 THEN 1 END) % 2))

    The first part uses NULLIF to replace 0 with NULL (LOG(0) would raise an error), ABS to change negative values to positive ones, then pours that into the log-based product formula (except that this one uses natural logarithms instead of 10-based logarithms). Because aggregates skip NULLs, a 0 in the input set (which is converted to NULL by NULLIF) simply drops out of the SUM, so the result of this first part is the product of the absolute values of all non-zero numbers. Only when every value is 0 does the SUM itself return NULL; the COALESCE function fixes this by replacing NULL with 0.

    The second part takes care of zeroes. ABS(SIGN(val)) is 0 for a zero and 1 for any other number, so the MIN is 0 as soon as a single 0 is present, and that forces the whole product to 0.

    The third part of the formula counts the number of negative values in the input set, then calculates the remainder after division by 2 - resulting in 0 for an even number of negatives, and 1 for an odd number. This value is multiplied by 2 and subtracted from 1, resulting in 1 if there's an even number of negatives, and -1 if there's an odd number. Multiply this by the product of the absolute values to get the final answer.

    I first saw this technique in a book by Itzik Ben-Gan (who else?). I now had to google for it, and found my first hit in a comment by Rob Farley to this blog post by Michael Coles - though I did change Rob's version a bit - he had omitted the COALESCE and instead used a count of the number of zero values, presumably to force a zero result in the same way the MIN does here.

    NULL values in the input set need no special treatment: SUM, MIN and COUNT all skip them, so the result is the product of the non-NULL values, which is how all other aggregates work (and a set with only NULL values returns NULL, because the MIN is then NULL). If you would rather have a single NULL input make the whole result NULL, I guess the best way would be to add some extra CASE expressions.
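    For anyone who wants to try this out, here is a minimal test harness. The table variable @t, its columns grp and val, and the sample rows are all made up for illustration. The expression includes a MIN(ABS(SIGN(val))) factor so that a zero in a group forces the result to 0; without it the zero would just drop out, because SUM skips the NULL that NULLIF produces.

```sql
-- Hypothetical sample data: one group per interesting case.
DECLARE @t TABLE (grp varchar(20) NOT NULL, val int NULL);

INSERT INTO @t (grp, val)
VALUES ('positives', 2),      ('positives', 3),      ('positives', 4),
       ('odd negatives', -2), ('odd negatives', 3),
       ('even negatives', -2), ('even negatives', -3),
       ('with zero', 2),      ('with zero', 0),      ('with zero', -3),
       ('with NULL', 2),      ('with NULL', NULL),   ('with NULL', 5);

SELECT grp,
       COALESCE(EXP(SUM(LOG(ABS(NULLIF(val, 0))))), 0)        -- product of |val| over the non-zero rows
       * MIN(ABS(SIGN(val)))                                   -- 0 if any row is 0, otherwise 1
       * (1 - 2 * (COUNT(CASE WHEN val < 0 THEN 1 END) % 2))   -- -1 for an odd number of negatives
       AS product
FROM @t
GROUP BY grp
ORDER BY grp;
```

    The groups should come out at roughly 6 (even negatives), -6 (odd negatives), 24 (positives), 10 (with NULL) and 0 (with zero). The result is a float that went through LOG and EXP, so it can be off in the last digits; round it if you expect whole numbers.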

    I also found a more readable version that uses the same technique, but broken down into steps, in a reply to this post on Stack Overflow. This reply was posted by gbn.

    SELECT
        GrpID,
        CASE
            WHEN MinVal = 0 THEN 0
            WHEN Neg % 2 = 1 THEN -1 * EXP(ABSMult)
            ELSE EXP(ABSMult)
        END
    FROM
        (
        SELECT
            GrpID,
            --log of +ve row values
            SUM(LOG(ABS(NULLIF(Value, 0)))) AS ABSMult,
            --count of -ve values. Even = +ve result.
            SUM(SIGN(CASE WHEN Value < 0 THEN 1 ELSE 0 END)) AS Neg,
            --anything * zero = zero
            MIN(ABS(Value)) AS MinVal
        FROM
            Mytable
        GROUP BY
            GrpID
        ) foo

    This version is more readable, but comes with the same caveat about NULL values as the compact version above.
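    If a NULL input should make the whole product NULL, one possible way to add the extra CASE logic is to compare COUNT(*) with COUNT(val), since only the latter skips NULLs. This is a sketch, not something taken from either of the versions above; the table variable @t and its sample rows are invented for the example.

```sql
-- Hypothetical sample data.
DECLARE @t TABLE (grp varchar(20) NOT NULL, val float NULL);

INSERT INTO @t (grp, val)
VALUES ('plain', 2),     ('plain', 5),
       ('negative', -2), ('negative', 5),
       ('with zero', 2), ('with zero', 0),
       ('with NULL', 2), ('with NULL', NULL);

SELECT grp,
       CASE
           WHEN COUNT(*) > COUNT(val) THEN NULL   -- at least one NULL input: propagate it
           WHEN MIN(ABS(val)) = 0 THEN 0          -- anything * zero = zero
           WHEN COUNT(CASE WHEN val < 0 THEN 1 END) % 2 = 1
               THEN -EXP(SUM(LOG(ABS(NULLIF(val, 0)))))   -- odd number of negatives
           ELSE EXP(SUM(LOG(ABS(NULLIF(val, 0)))))
       END AS product
FROM @t
GROUP BY grp;
```

    With these rows the results should be roughly 10, -10, 0 and NULL. Dropping the first WHEN branch gives the "ignore NULLs like other aggregates" behaviour instead.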


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis

    SSC Guru

    Points: 64685

    I just saw the answer by Mike. I like the more compact way to get the multiplication factor (CHECKSUM_AGG), but I'm also a bit concerned - I could not find any documentation (at least not in the 30 seconds I spent searching -I know, pathetic!-) that this aggregate guarantees a negative result if the number of negative input values is odd, and a positive result otherwise. Without such documentation, this method appears to be less safe.

    And a final comment regarding the original article that I forgot in my previous post:

    "Notice the decimal point after the first argument of the power function. You need it to force a precession of 18 so that 4 * 2 equal 8 instead of 7."

    The decimal point does not force a precision of 18, it forces the use of non-integer (numeric(5,2) to be precise). Without this, all numbers after the decimal place are cut off. That's why most versions of this algorithm on the internet use natural logarithm rather than 10-based logarithm - that one always uses non-integer calculation.
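    If you want to see for yourself which data type each variant computes in, SQL_VARIANT_PROPERTY can report it. A minimal sketch; the exponent 0.9 is an arbitrary choice, picked because the true value of 10 to the power 0.9 (about 7.94) is not a whole number, so any truncation or rounding shows up in the result column.

```sql
-- Report the base type, precision, scale and value of each expression.
SELECT 'POWER(10, 0.9)' AS expr,
       SQL_VARIANT_PROPERTY(POWER(10, 0.9), 'BaseType')  AS base_type,
       SQL_VARIANT_PROPERTY(POWER(10, 0.9), 'Precision') AS num_precision,
       SQL_VARIANT_PROPERTY(POWER(10, 0.9), 'Scale')     AS num_scale,
       CAST(POWER(10, 0.9) AS sql_variant)               AS result
UNION ALL
SELECT 'POWER(10., 0.9)',
       SQL_VARIANT_PROPERTY(POWER(10., 0.9), 'BaseType'),
       SQL_VARIANT_PROPERTY(POWER(10., 0.9), 'Precision'),
       SQL_VARIANT_PROPERTY(POWER(10., 0.9), 'Scale'),
       CAST(POWER(10., 0.9) AS sql_variant)
UNION ALL
SELECT 'EXP(0.9 * LOG(10))',
       SQL_VARIANT_PROPERTY(EXP(0.9 * LOG(10)), 'BaseType'),
       SQL_VARIANT_PROPERTY(EXP(0.9 * LOG(10)), 'Precision'),
       SQL_VARIANT_PROPERTY(EXP(0.9 * LOG(10)), 'Scale'),
       CAST(EXP(0.9 * LOG(10)) AS sql_variant);
```

    No expected output is listed here on purpose: the point of the query is to check the types on your own server rather than rely on anyone's recollection of them.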



  • Paul White

    SSC Guru

    Points: 150442

    We use a CLR aggregate for this.

  • archie flockhart

    SSCrazy

    Points: 2339

    I'd be interested to know what problems require the calculation of a product from each row in a query in this way.

    Presumably also, you'd need to be aware of the number of values and size of values that are expected - it wouldn't take very many numbers multiplied together for the product to get very large. (Or very small, if the numbers being multiplied are between -1 and 1 )

  • David.Poole

    SSC Guru

    Points: 75313

    archie flockhart (2/6/2012)


    I'd be interested to know what problems require the calculation of a product from each row in a query in this way.

    Calculating correlation statistics

  • archie flockhart

    SSCrazy

    Points: 2339

    David.Poole (2/6/2012)


    archie flockhart (2/6/2012)


    I'd be interested to know what problems require the calculation of a product from each row in a query in this way.

    Calculating correlation statistics

    It's been a long time since I did any correlations but I can't remember needing to get a product of all the numbers in a dataset, and a quick google for the formulas didn't throw up anything that required Product( X1 .. Xn)

    Can you point me at more details ?

  • Hugo Kornelis

    SSC Guru

    Points: 64685

    archie flockhart (2/6/2012)


    Presumably also, you'd need to be aware of the number of values and size of values that are expected - it wouldn't take very many numbers multiplied together for the product to get very large. (Or very small, if the numbers being multiplied are between -1 and 1 )

    If I ever need to do this for real, I'll almost certainly add a CAST to float, to make sure that results of virtually all sizes can be represented.
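    As a rough sketch of what such a CAST might look like (the table variable @t, the column val and the sample values are made up): casting both the base and the input to float keeps the whole calculation in float, so the result is not limited to the 38 digits a decimal can hold. Only positive values are used here, to keep the example short.

```sql
-- Hypothetical sample data: five values of 1e10, so the true product is 1e50.
DECLARE @t TABLE (val bigint NOT NULL);

INSERT INTO @t (val)
VALUES (10000000000), (10000000000), (10000000000),
       (10000000000), (10000000000);

SELECT POWER(CAST(10 AS float), SUM(LOG10(CAST(val AS float)))) AS product_base10,
       EXP(SUM(LOG(CAST(val AS float))))                        AS product_natural
FROM @t;
```

    Both columns should come back as approximately 1E+50, a value with more digits than decimal(38, 0) can represent. Float still has its own limit (around 1.79E+308), so very long or very large inputs can overflow there too.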



  • Stephen.Rice

    Grasshopper

    Points: 23

    And another way to deal with 0 and negatives, if you don't want to use a CASE expression:

    MIN(ABS(SIGN(num))) * (1-(SUM(SIGN(num) * (SIGN(num) -1))%4)) * POWER(10.,SUM(LOG10(ABS(Num) + 1 - ABS(SIGN(num)))))

  • Hugo Kornelis

    SSC Guru

    Points: 64685

    SQL Kiwi (2/6/2012)


    We use a CLR aggregate for this.

    That's definitely cleaner, and easier to understand code. But how does the performance of a CLR aggregate compare to performance of a complicated formula using native functions only? Have you ever done any perf testing?



  • Hugo Kornelis

    SSC Guru

    Points: 64685

    Stephen.Rice (2/6/2012)


    And another way to deal with 0 and negatives, if you don't want to use a CASE expression:

    MIN(ABS(SIGN(num))) * (1-(SUM(SIGN(num) * (SIGN(num) -1))%4)) * POWER(10.,SUM(LOG10(ABS(Num) + 1 - ABS(SIGN(num)))))

    It's a real challenge to decipher what it does and how it works (I would never allow this in production code without extensive comments!) - but I like how it manages to avoid all explicit CASE expressions, and all implicit CASE expressions (NULLIF, COALESCE) as well. I'm not sure if it's true, but I suspect that CASE can be a tad slower than plain arithmetical functions.



  • Stephen.Rice

    Grasshopper

    Points: 23

    Sorry! I definitely should have added some comments.

    The first section deals with zeros. ABS(SIGN()) is 1 for negative or positive numbers and 0 for zero, so the MIN will be 0 if any zeros exist in the input set.

    The next section deals with negatives. SIGN(Num) * (SIGN(NUM)-1) yields 0 for positive or zero numbers and 2 for negative numbers. So we add them up and take a mod 4 to give 2 when there are an odd number of negatives or 0 where there are an even number.

    An even number of negatives leaves the product positive, so we need to leave it alone. An odd number turns the product negative. So by taking 1 - (the above) we get 1 when there are no negatives or an even number of negatives, and -1 for an odd number of negatives.

    Finally we just need to prevent the LOG function from throwing an error if we pass in a zero. We can do that by adding 1 and taking off the ABS(SIGN()) of the number. This does not change the result for any non zero number but alters zeros to ones (which prevents an error being thrown).

    As any product with a zero will yield zero (and we deal with that case in the first clause) it doesn't matter that we 'muck up' with the log here.
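    The three sections described above can be looked at one at a time by selecting them as separate columns. A minimal sketch, using a made-up table variable @t with invented sample rows; the three expressions are the pieces of the formula, unchanged.

```sql
-- Hypothetical sample data: one group per interesting case.
DECLARE @t TABLE (grp varchar(20) NOT NULL, num int NOT NULL);

INSERT INTO @t (grp, num)
VALUES ('positives', 2),       ('positives', 3),       ('positives', 4),
       ('odd negatives', -2),  ('odd negatives', 3),
       ('even negatives', -2), ('even negatives', -3),
       ('with zero', 2),       ('with zero', 0),       ('with zero', -3);

SELECT grp,
       MIN(ABS(SIGN(num)))                                   AS zero_factor,  -- 0 if any zero, else 1
       1 - (SUM(SIGN(num) * (SIGN(num) - 1)) % 4)            AS sign_factor,  -- -1 for odd negatives, else 1
       POWER(10., SUM(LOG10(ABS(num) + 1 - ABS(SIGN(num))))) AS abs_product   -- zeros counted as 1
FROM @t
GROUP BY grp;
```

    zero_factor should be 1 for every group except 'with zero', where it is 0. sign_factor should be -1 for 'odd negatives' and 'with zero' (one negative each) and 1 for the other two groups. Multiplying the three columns together gives the full expression.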

    Stephen

  • cdac.nilesh

    Newbie

    Points: 1

    It is useful..!!

