SUM of FLOAT inconsistency

  • Comments posted to this topic are about the item SUM of FLOAT inconsistency

  • Great QotD! One of the rare ones that isn't easy to cheat on.

    Thumbs up!

    Ronald

    Ronald Hensbergen

    Help us, help yourself... Post data so we can read and use it: http://www.sqlservercentral.com/articles/Best+Practices/61537/

    2+2=5 for significant large values of 2

  • 3 of the options seemed very unlikely, and the answer is easily verified by changing the order of @float* inserts. Cheating rules ok 😉

  • That's true:

    DECLARE @SumA float, @SumB float

    DECLARE @MyFloat1 float, @MyFloat2 float, @MyFloat3 float

    DECLARE @MyTable table

    (

    ID int primary key identity,

    NumA float,

    NumB float

    )

    SET @MyFloat1 = 10000000000020000

    SET @MyFloat2 = -10000000000010000

    SET @MyFloat3 = 1

    INSERT INTO @MyTable

    SELECT @MyFloat1, CAST(@MyFloat3 AS FLOAT)

    UNION

    SELECT @MyFloat2, @MyFloat1

    UNION ALL

    SELECT @MyFloat3, @MyFloat2

    SELECT SUM(NumA), SUM(NumB) FROM

    (select top 100 * from @MyTable

    order by 1

    ) AS A

    SELECT SUM(NumA), SUM(NumB) FROM

    (select top 100 * from @MyTable

    order by 2

    ) AS B

    Result:

    (3 row(s) affected)

    10001   10000

    (1 row(s) affected)

    10000   10001

    (1 row(s) affected)
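    For what it's worth, the discrepancy is pure IEEE 754 rounding, not a SQL Server quirk. At the magnitude of @MyFloat1 (about 10^16), consecutive FLOAT values are 2 apart, so a "+ 1" applied before the two large values cancel each other is lost to rounding, while a "+ 1" applied after the cancellation survives. A minimal sketch of just the rounding step, reusing the values above (the variable names are only illustrative):

    ```sql
    -- At magnitude 1e16 the gap between adjacent FLOAT (IEEE double)
    -- values is 2, so "+ 1" is absorbed unless the big terms cancel first.
    DECLARE @Big float, @Neg float, @One float;
    SELECT @Big = 10000000000020000,
           @Neg = -10000000000010000,
           @One = 1;

    SELECT (@Big + @One) + @Neg AS AddOneFirst, -- 10000: the 1 was rounded away
           (@Big + @Neg) + @One AS CancelFirst; -- 10001: the 1 survives
    ```

    Those are the same two values the SUMs produce; which one you get depends on the order in which the aggregate happens to accumulate the rows.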

  • Were the declarations of @SumA and @SumB supposed to confuse us on this one? I found them to be completely unnecessary.

    Thanks for posting. I have often found SQL calculations that create inconsistent or incorrect results as a misunderstanding of datatype size and rounding issues which can often be fixed by simply calculating in the appropriate order.

    Of course, even the most experienced query writers can miss this sort of thing sometimes. If the calculations are critical, it is usually a good idea to put in some sort of validation check so that if there are problems, they can be found early and corrected quickly.

    Main rule is never to use float, because of its unwanted effects.

    Here is the version with decimal. It is always correct:

    DECLARE @SumA float, @SumB float

    DECLARE @MyFloat1 float, @MyFloat2 float, @MyFloat3 float

    DECLARE @MyTable table

    (

    ID int primary key identity,

    NumA DECIMAL(17,0),

    NumB DECIMAL(17,0)

    )

    SET @MyFloat1 = 10000000000020000

    SET @MyFloat2 = -10000000000010000

    SET @MyFloat3 = 1

    INSERT INTO @MyTable

    SELECT @MyFloat1, CAST(@MyFloat3 AS FLOAT)

    UNION

    SELECT @MyFloat2, @MyFloat1

    UNION ALL

    SELECT @MyFloat3, @MyFloat2

    SELECT SUM(NumA), SUM(NumB) FROM

    (select top 100 * from @MyTable

    order by 1

    ) AS A

    SELECT SUM(NumA), SUM(NumB) FROM

    (select top 100 * from @MyTable

    order by 2

    ) AS B

    This is my first QotD, and I'm pleased that it's gone quite well. It was prompted by outrage at missing out on points for another QotD (Sep 23, 2008, Accessing and changing data 2008), where it was deemed that the SQL Server 2008 GROUP BY GROUPING SETS was not the same as the equivalent GROUP BY code because "Aggregates on floating-point numbers might return slightly different results." I protested but didn't get my 2 points back. So I wrote this QotD to prove my point: identical aggregate code on an identical set of FLOAT numbers can ALSO produce "slightly different results", depending on the order of execution (which *should* be irrelevant in a perfect theoretical world without truncation).

    BTW, playing with UNION and UNION ALL can reverse the results, but in my experience they will always produce the two differing results. Others noticed similar effects by sorting, etc.

    Carlo Romagnano (10/13/2008)


    Main rule is never to use float, because of its unwanted effects.

    Here is the version with decimal. It is always correct.

    It always is correct WITH THIS SET OF DATA. Try adding a couple of zero(e)s at the end of each of the 3 numbers and watch "Msg 8115, Level 16, State 6, Line 12 - Arithmetic overflow error converting float to data type numeric." appear.

    The FLOAT still works, and produces even stranger (but predictable) results of 1000164 and 1000192.

    That's the point - if the numbers were coming from an external source (and may have decimal places), you might not have the luxury of knowing the full range of numbers used. As a chemical engineer I learnt about floating point numbers before integers, and converted all measurements to Megafurlongs per microfortnight for consistency (the speed of light is just above 1.8 with those units - far more manageable).
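    The overflow is easy to reproduce in isolation: the original values have 17 digits, so two extra zeros make 19, which no longer fits in DECIMAL(17,0), while a wider target copes. A sketch (not part of the original script):

    ```sql
    -- 10000000000020000 has 17 digits; two more zeros make 19 digits,
    -- which overflows DECIMAL(17,0) but fits easily in DECIMAL(38,0).
    DECLARE @Scaled float;
    SET @Scaled = 1000000000002000000;

    SELECT CAST(@Scaled AS decimal(38,0)); -- succeeds
    SELECT CAST(@Scaled AS decimal(17,0)); -- Msg 8115: arithmetic overflow
    ```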

    Chris.Strolia-Davis (10/13/2008)


    Were the declarations of @SumA and @SumB supposed to confuse us on this one? I found them to be completely unnecessary.

    The declarations were not intended to confuse; as a software developer for many decades (mainly with C# for the last 6 years, with some T-SQL on the side), I like to define all my variables of a type that I decide. I don't want to rely on some box of blacklegging binary bits* to make the decision for me. With the latest releases of dotNet allowing implicit declaration types, I'm a bit nervous actually.

    Brewmanz

    aka Bryan White, NZ

    * Thanks to Douglas Adams for this term

  • brewmanz (10/13/2008)


    Chris.Strolia-Davis (10/13/2008)


    Were the declarations of @SumA and @SumB supposed to confuse us on this one? I found them to be completely unnecessary.

    The declarations were not intended to confuse; as a software developer for many decades (mainly with C# for the last 6 years, with some T-SQL on the side), I like to define all my variables of a type that I decide. I don't want to rely on some box of blacklegging binary bits* to make the decision for me. With the latest releases of dotNet allowing implicit declaration types, I'm a bit nervous actually.

    All I am saying is that I don't see you actually using @SumA or @SumB in this query. They were declared, but don't seem to have been used. If you comment out that line altogether, it changes nothing ... that I'm aware of ;).

  • Chris.Strolia-Davis (10/13/2008)


    All I am saying is that I don't see you actually using @SumA or @SumB in this query. They were declared, but don't seem to have been used. If you comment out that line altogether, it changes nothing ... that I'm aware of ;).

    Doh! You are quite right.

    When I re-read the QotD text, I thought that maybe it had been editorially modified as I *did* use them in my earlier playing, assigning SUM(Num%) to the @Sum% variables. Maybe they'd changed things.

    But no; I've checked my draft pre-submission, and the editors are not to blame.

    Mea culpa.

    Please don't tell my daughter. I take great delight in proofreading her work and finding mistakes (she works for a brochure publishing company). Sadly, it seems that I am capable of making mistakes, too. Soon I'll be into double figures this century 😉

    Regards

    brewmanz

    Great QotD, from another Douglas Adams fan and SQL-using Kiwi! There can't be many of us! 🙂

    Nice question...

  • brewmanz (10/13/2008)


    It always is correct WITH THIS SET OF DATA. Try adding a couple of zero(e)s at the end of each of the 3 numbers and watch "Msg 8115, Level 16, State 6, Line 12 - Arithmetic overflow error converting float to data type numeric." appear.

    The FLOAT still works, and produces even stranger (but predictable) results of 1000164 and 1000192.

    I prefer an overflow error to wrong data. You can specify a precision of up to 38 digits.

    From BOL:

    Numeric data types that have fixed precision and scale.

    decimal[ (p[ , s] )] and numeric[ (p[ , s] )]

    Fixed precision and scale numbers. When maximum precision is used, valid values are from -10^38 + 1 through 10^38 - 1. The SQL-92 synonyms for decimal are dec and dec(p, s). numeric is functionally equivalent to decimal.

    p (precision)

    The maximum total number of decimal digits that can be stored, both to the left and to the right of the decimal point. The precision must be a value from 1 through the maximum precision of 38. The default precision is 18.

    s (scale)

    The maximum number of decimal digits that can be stored to the right of the decimal point. Scale must be a value from 0 through p. Scale can be specified only if precision is specified. The default scale is 0; therefore, 0 <= s <= p. Maximum storage sizes vary, based on the precision.
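    Putting the BOL excerpt into practice: with a wide enough precision, decimal arithmetic stays exact for this range, and the sum no longer depends on accumulation order. A sketch using the scaled-up values from earlier in the thread (the choice of precision 28 is arbitrary within the 38-digit limit):

    ```sql
    -- Exact DECIMAL arithmetic: order of accumulation cannot change the result.
    DECLARE @A decimal(28,0), @B decimal(28,0), @C decimal(28,0);
    SELECT @A = 1000000000002000000,
           @B = -1000000000001000000,
           @C = 100;

    SELECT (@A + @C) + @B AS AddSmallFirst, -- 1000000100
           (@A + @B) + @C AS CancelFirst;   -- 1000000100, identical
    ```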

  • brewmanz (10/13/2008)


    Please don't tell my daughter. I take great delight in proofreading her work and finding mistakes (she works for a brochure publishing company). Sadly, it seems that I am capable of making mistakes, too. Soon I'll be into double figures this century 😉

    Ha ha, it happens to the best of us my friend. Seems like life enjoys dishing out the humble pie every now and then.

    Thanks again for the article. It's good for developers to be aware of, and judging by the results of the quiz, there are quite a few out there who could use the enlightenment.

  • Brewmanz... keep them coming. That was great, and exactly what we all need! Good explanation, too. I found that it was dependent upon order, but could not figure out why. A+++ QotD!

    Todd Carrier
    MCITP - Database Administrator (SQL 2008)
    MCSE: Data Platform (SQL 2012)

    You can validate this using Excel as well; the same behavior shows up there (refer to the attachment).

    Excellent post!

    Thanks & Regards,
    Nakul Vachhrajani.
    http://nakulvachhrajani.com

    Follow me on
    Twitter: @sqltwins

Viewing 14 posts - 1 through 14 (of 14 total)
