calc: rounding: general precision problem?

hi @all,

one more question for today … IMHO difficult …

short ver.: does anybody know how to correctly round ‘=0,1 + 0,2’ to 16 decimals? 0,300000000000000099920 is not a good result.

‘=0,1 + 0,2’ gives a wrong result with fp-math, namely ~0,300000000000000044409 (you can have a look with ‘www.weitz.de/ieee’, and there is even a dedicated website explaining why this is so: https://0.30000000000000004.com/).

Calc calculates this too, but does not show the deviation (the display is rounded to 15 digits). Calc continues to calculate with the wrong value, and with ‘=RAWSUBTRACT(0,1 + 0,2; 0,3)’ → ~5,55111512312578E-17 you can make it visible.

(Up to here Excel is just as dumb; it doesn’t know ‘RAWSUBTRACT’, you have to put the formula in parentheses instead.)
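For anyone who wants to reproduce this outside a spreadsheet: any language with IEEE 754 binary64 doubles shows the same effect. A minimal Python sketch (Python floats are the same binary64 doubles Calc calculates with):

```python
# IEEE 754 binary64 arithmetic, same as Calc's doubles
a = 0.1 + 0.2
print(a == 0.3)   # False: the sum is not the double nearest to 0.3
print(a - 0.3)    # 5.551115123125783e-17, the RAWSUBTRACT analogue
```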

Normally it should be quite easy to repair 0,30000000000000004 by rounding to 16 decimal places. Unfortunately, this only works in Excel, which ‘fixes’ the value even if you round to 25 decimal places (the error should actually be preserved).

Calc is ‘worse’ and makes the error bigger when rounding! ‘=ROUND(0,1 + 0,2; 16)’ gives ~0,300000000000000099920; Calc does not show the last disturbing digits, but ‘=RAWSUBTRACT(ROUND(0,1 + 0,2; 16); 0,3)’ → 1,11022302462516E-16 does.

(I guess Calc compares, for rounding, against the ‘correct’ double value for 0,3, which is a bit smaller than the real 0,3, namely ~0,299999999999999988898; that is where the RAWSUBTRACT display of 5,5 instead of 4,4 comes from, and as the difference starts with a 5, Calc decides to round up.)

Actually … one could simply ignore such small errors, but since:

  • they are not necessary,
  • they can add up in more complex calculations, multiply or even exponentiate, and then disturb,
  • they sensitively disturb comparisons like ‘=’, ‘<’ or ‘>’,
  • many ‘higher’ functions are based on the correct functioning of such simple functions, …

it would be nice if this worked better … and it is possible, see Excel.

Does anyone have an idea?

(Please don’t comment ‘round to fewer decimals’: 1. that could fail too (for other values), 2. I’d like to preserve the full value.)

Calc’s ROUND is implemented using RoundNumber, and it in turn uses rtl_math_round.

The latter first multiplies the passed number by the relevant power of 10. In this case, it is 10^16, which has 37 significant bits in its mantissa. The multiplied value (~0,30000000000000004) has 52 significant bits in its mantissa, and naturally, multiplication increases the number of significant positions (since neither number is a whole power of 2). And of course, the operation introduces its own inaccuracy (since there’s no way to store more than 52 significant bits of the result in the mantissa). The end result of this is exactly 3000000000000000.5.

Then, since the result is ~3e15, which is smaller than 2^52, the value is incremented by 0.5, and rounded using rtl::math::approxFloor. The latter uses rtl_math_approxValue, which doesn’t do anything in this case, so the value increased by 0.5 (i.e., exactly 3000000000000001) simply gets floored. The result (still 3000000000000001) is divided by 10^16. This gives ~0,30000000000000009992.
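The scheme just described can be mimicked in a few lines. This is only a simplified sketch of the multiply / add 0.5 / floor / divide approach (not the actual rtl_math_round code), but it reproduces the observed result:

```python
import math

def naive_round(x: float, decimals: int) -> float:
    # multiply by 10^decimals, add 0.5, floor, divide back;
    # the multiplication already rounds 3000000000000000.44... up to
    # exactly 3000000000000000.5, so the floor lands one too high
    p = float(10 ** decimals)   # 10^16 is exactly representable as a double
    return math.floor(x * p + 0.5) / p

v = 0.1 + 0.2              # ~0,30000000000000004
print(naive_round(v, 16))  # 0.3000000000000001
```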

(By the way: I prefer Base Convert: IEEE 754 Floating Point as a handier tool to look into the binary representation of a double.)

I don’t know if there’s a more accurate method to round to specified number of decimals. I hope there is; and I can imagine that Excel could use it. However, I see a common logical fallacy in the question:

this only works in Excel, which ‘fixes’ the value even if you round to 25 decimal places (the error should actually be preserved)

it would be nice if this worked better … and it is possible, see Excel.

The text assumes that Excel is indeed able to do that properly. But nothing in the text proves that; the assumption is based on a single data point. A random result that happens to be closer in one application than in another tells you nothing about the accuracy of the method used in either of them. You can only tell by performing a systematic test, with a uniformly distributed set of numbers without a common pattern (e.g., numbers with very different counts of significant decimals must be tested). E.g., Excel might simply use 15 as its maximal decimals count, or (more flexibly) a value that does not exceed 15 significant digits of the passed value.

But yes, if someone suggests the algorithm, it would be nice; even if someone simply shows that this is possible using doubles (by doing proper testing of Excel), it would already be a great result.
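Such a systematic test could look roughly like this Python sketch. The names are mine, not from any spreadsheet code: `naive_round` stands in for the multiply/floor scheme described earlier, and the reference rounds the exact decimal value of each double with the `decimal` module:

```python
import math
import random
from decimal import Decimal, ROUND_HALF_UP

def naive_round(x: float, decimals: int) -> float:
    # the multiply / add 0.5 / floor / divide scheme
    p = float(10 ** decimals)
    return math.floor(x * p + 0.5) / p

def exact_round(x: float, decimals: int) -> float:
    # Decimal(x) is the *exact* value of the double, so this rounds correctly
    q = Decimal(1).scaleb(-decimals)
    return float(Decimal(x).quantize(q, rounding=ROUND_HALF_UP))

random.seed(1)
disagree = 0
for _ in range(10_000):
    # probe values with very different counts of significant decimals
    x = round(random.uniform(0.0, 1.0), random.randint(0, 17))
    if naive_round(x, 15) != exact_round(x, 15):
        disagree += 1
print(disagree, "disagreements out of 10000 at 15 decimals")
```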

hello @mikekaganski,

yeah, that sounds logical: the coarser granularity after the multiplication increases the error even more :frowning:

I had thought about whether ‘plus 0.5 minus one bit’ or ‘plus 0.5 minus 1 ULP’ could help against this, but I’m not ‘done’ with it.

Altogether I am less a ‘solution finder’ than a ‘problem digger’. I had already noticed a few times that a well-intentioned rounding goes wrong, but this example made me much more conscious of its elementary harmfulness and difficulty.

In my digging I look at the competing products from time to time: Gnumeric has the same problem; Excel looks good at first, but then rounds too sharply.

I managed correct rounding in gnumeric with a formula monster which would make this comment too long, thus ---- continued ----

---- continued ----
the following formula calculates partly with the textual representation of the value, which has more digits in Gnumeric than in Calc or Excel. IMHO it ‘does the job’, but it is already overweight, and it would become worse if adapted to Calc, where the digits would have to be extracted with RAWSUBTRACT …

=value((left(text(F4,"0.0000000000000000000E+000"),if((2^(-52+int(log(abs(F4),2))))<10^(-15+int(log(abs(F4),10))),17,16))&right(text(F4,"0.0000000000000000000E+000"),5))+if(value(mid(text(F4,"0.0000000000000000000E+000"),if(2^(-52+int(log(abs(F4),2)))<10^(-15+int(log(abs(F4),10))),18,17),1))>4,(10^(if(2^(-52+int(log(abs(F4),2)))<10^(-15+int(log(abs(F4),10))),-15,-14)+int(log(abs(F4),10)))),0))  

(value to round in F4, US-locale)

and it will likely fail in Calc, as RAWSUBTRACT produces something like 5.5E-17 instead of 4.4E-17; the ‘5’ would trigger a round-up :frowning:

thus any better / easier solution would be highly appreciated …

@mikekaganski,

(since there’s no way to store more than 52 significant bits of the result in mantissa)
Splitting hairs - sorry - imho it is 53 bits,
but … I always wanted to know this: an x86 FPU is supposed to have 80-bit instead of 64-bit wide registers, and if you do critical calculations there, many of the standard errors are supposedly avoidable … addressable with ‘long double’ or something like that.
Is this true, and does Calc use it?
I tried to simulate this in weitz: it works and avoids the rounding error if you calculate (0.300000000000004 * 1E16 + 0,5) / 1E16 with 68 instead of 52 mantissa bits.
‘baseconvert’ looks nice; I like weitz because it allows simple calculations.

imho it is 53 bits,

… is about the interpretation of the value, not about the physical size of the mantissa: 52 bits are physically stored, and the 53rd is the implicit leading 1.

LibreOffice cannot and should not use extended-precision numbers, since they are not guaranteed to be present on all its supported platforms. LibreOffice is not limited to x86 platforms, and it can’t have different results on different platforms - that would be not interoperable and not consistent.

I had thought about whether ‘plus 0.5 minus one bit’ or ‘plus 0.5 minus 1 ULP’ could help against this, but I’m not ‘done’ with it.

Which would introduce a systematic error towards negative infinity.

since they are not guaranteed to be present on all its supported platforms.
Understood; I thought of that, but that blocks progress. What about using them, and if they are not available, displaying a warning ‘your hardware is not sufficient for best precision’?
different results … that would be not interoperable and not consistent.
yeah, that’s exactly what I’m saying about displaying different values than the ones you calculate with.
Which would introduce a systematic error towards negative infinity.
yeah, and … just ‘brainstorming’ … that would be OK for values where the ‘value used for representation’, e.g. 0.299999999999999988898, is below the ‘fewer decimals’ ‘value to be represented’, 0.3000000000000000,
and it would be needed the other way around for values where the exact value of the ‘representer’ is above the represented ‘fewer decimals’ value, as for e.g. 0.1000000000000000 <-> 0.100000000000000005551 or 0.2000000000000000 <-> 0.200000000000000011102 …
but how to detect?

but that blocks progress

No, it doesn’t. Using them would only shift the problem to higher bits, not eliminate the cause. Additionally, this is not an actual problem that prevents efficient work with the program; it only surprises people on first encounter. Billions of documents are created and used in spreadsheet software each day, and they work as intended. The perceived problem only feels that big to a tiny fraction of users. E.g., you have already spent several years trying to “improve” this without any positive result, IIUC.

but how to detect?

Exactly; that is absolutely impossible. When you have a value, you have no way to know whether it is exact, or rounded from a shorter decimal value, or from a longer one. The binary number is just a binary number, nothing more.

mh … I don’t like ‘impossible’ … much effort - yes, stretching my capabilities - yes, impossible - no.
Mostly, when someone has a result he also knows about the calculation it evolved from; e.g. 0,1 + 0,2 shouldn’t have more than one decimal digit.
If I have a value in a range where at most 16 decimal digits are accurate (0,3), and the value computes to ~0,300000000000000044409, rounding should be OK; the problem is getting it to work correctly.
I found a formula for Gnumeric which is a little less cryptic:

=value(text(A6,if(2^(-52+int(log(abs(A6),2)))<10^(-15+int(log(abs(A6),10))),"0.000000000000000E+000","0.00000000000000E+000")))  

That does better rounding than the ‘normal’ ROUND, as long as the deviation in the result/value doesn’t exceed 0,5 ‘decimal ULPs’ for that range. Unfortunately it does not work in Calc, because the digits from the 15th on are not displayed, and what you can work out with RAWSUBTRACT is falsified by the additional deviation of the representation of 0,3 … :frowning:
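The idea behind that formula - re-rounding via the textual representation to at most 15-16 significant digits - can be sketched in one line of Python, since Python's float formatting is correctly rounded (a hand-rolled illustration, not how any spreadsheet implements it):

```python
def round_via_text(x: float, sig: int = 15) -> float:
    # format to `sig` significant digits (correctly rounded), parse back
    return float(f"{x:.{sig - 1}e}")

print(round_via_text(0.1 + 0.2))      # 0.3
print(round_via_text(0.1 + 0.2, 17))  # 0.30000000000000004, error preserved
```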

when someone has a result he also knows about the calculation it evolved from

The “someone” is Calc, and it does not know which calculation a random number comes from (and even if it tried, all it could know would be “I need to ROUND a number resulting from the summation of 0.200000000000000011102230246251565404236316680908203125 and 0.1000000000000000055511151231257827021181583404541015625”, which are the only numbers it has). A number may come from a complex formula, or from a UDF, or from user input. It is not possible to know how precise the result must be. There is a word, impossible; and you ignore it, still repeating the “idontbelieveyouyouallareliersiamtheonlyonewhoknowsthetruth!!!111111” mantra. But - well, not surprising.

@mikekaganski:
IMHO progress evolves from trying again, even if it looks difficult, even if it is! difficult; do not focus on the failures but on the goal.
If Calc did the mentioned rounding to 16 decimal places correctly, things would look better for me - and for Calc?
Remember, the question was why Calc does it wrong, and whether it can be done better, not whether you have advanced knowledge about ‘impossible’.
Why do I insist so persistently on this detail? I’m afraid that other rounding calculations (also to fewer digits?) might fail in the same manner, and that accordingly ‘round your results’ is insufficient advice for users who have problems with the insufficient precision of fp-math (and Calc).
Your comments sound as if such considerations are not important to you … :frowning:

Progress evolves from thinking. Only those who try again knowing what they do actually advance progress; people who keep hitting their head against the wall are not what is called “progress”.

I don’t think the problems you raise are important. But - you keep missing the point, again and again.

Look: I didn’t tell it’s impossible to have a better rounding; I wrote in my answer:

I don’t know if there’s a more accurate method to round to specified number of decimals. I hope there is …

if someone suggests the algorithm, it would be nice; even if someone simply shows that this is possible using doubles (by doing proper testing of Excel), it would already be a great result.

But you ignored that, and instead chose to try to make my knowledge (about a specific thing) look like it is “stopping the progress”, while in fact your actions only show how little you care about knowing and understanding. Yes, I know that some things are impossible; and that’s where you keep hitting the wall, instead of thinking.

I’m afraid that other rounding calculations (also to fewer digits?) might fail in the same manner, and that accordingly ‘round your results’ is insufficient advice for users who have problems with the insufficient precision of fp-math (and Calc).
your comments sound as if such considerations are not important to you … :frowning:

Yes, I don’t think that “such considerations” (specifically, “i’m afraid” not based on understanding, and without a goal to understand) are important. Because they are wrong. I even made a step-by-step description of the algorithm, and you could realize that the expected error is really high when talking about the 16th significant decimal, but decreases exponentially for smaller rounding precision. And that comes from understanding and knowledge, as opposed to “considerations” based on “i’m afraid”.

@mikekaganski:

“Yes, I don’t think that “such considerations” (specifically, “i’m afraid” not based on understanding, and without a goal to understand) are important.”
yes, I expected such an answer from you (stupid insinuations from someone who is always very touchy himself?),
did some digging (as mentioned before: ‘once you know where to search, it’s easier to find …’),
IMHO neither '=ROUND(RAWSUBTRACT(0,3;-48*10^-17);15)' should result in ‘0,300000000000001’, nor should '=ROUND(RAWSUBTRACT(0,3;-499999950*10^-17);8)' produce ‘0,30000001’ (note that I used RAWSUBTRACT to avoid double rounding; ‘+’ failed as well).
The above happens even though the mathematical calculation should give 0,300000000000000, the binary representation of 0,3 is ‘short’, and the representation of the value before rounding has a ‘4’ in the critical ‘next digit’.
rounding to fewer decimals has ‘less’ but ‘bigger’! errors, thus they ‘matter’ … IMHO … not for you? OK :frowning:

rounding to fewer decimals has ‘less’ but ‘bigger’! errors

Oh, lol. So you don’t understand what “error” means.

The error in both these cases is not bigger, but less. The error is not 1e-15 or 1e-8; it is by how much the calculation was mistaken when deciding whether the value was closer to 0 or 1 in the last significant digit.

And in the first case (0.3000000000000004884981308350688777863979339599609375), it was mistaken by ~0.0115e-15; in the second case (0.300000004999999514421205049075069837272167205810546875), the error was ~0.0000000486e-8, so the relative error decreased exponentially each time.

And no, it doesn’t matter much. Both cases are close to both ...00 and ...01, and the difference is each time practically insignificant (and the error each time smaller). Try to put a pencil dot in the middle of a line, and then try to see which side of the line it’s closer to. The rounding should answer that for you; and its error is how much it is off when deciding. No, it’s not much off. Its relative error is about 1 ULP in binary representation. That is not a bad result; but nevertheless, I suppose it might be possible to have better results - simply because it’s possible to print the binary value in decimal notation exactly, so an algorithm printing it, then analyzing the trailing digits, and then converting the resulting string back to a number would be able to get exactly the closest representation of the correct result - but this would be too long.
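The print-exactly / analyze / convert-back idea mentioned above is easy to try in Python with the `decimal` module, which can hold the exact decimal expansion of any double (a sketch of the idea, not a claim about how Calc should implement it):

```python
from decimal import Decimal, ROUND_HALF_UP

def decimal_round(x: float, decimals: int) -> float:
    # Decimal(x) is the exact decimal expansion of the binary double;
    # quantize() rounds that expansion exactly, and float() then returns
    # the closest representable double to the correct result
    q = Decimal(x).quantize(Decimal(1).scaleb(-decimals),
                            rounding=ROUND_HALF_UP)
    return float(q)

print(decimal_round(0.1 + 0.2, 16))  # 0.3 (digit 17 is a 4, so it rounds down)
```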

Or, a better definition of rounding error would be how much farther the found value is from the given value than the correct answer is. This would make the absolute value of the evaluated error exactly twice as big, but the exponential decrease of the relative error would not change.

So for the initial number 0.3000000000000000444089209850062616169452667236328125 and precision 1e-16: the calculated value is 0.3000000000000001 (which is then represented with an error itself, but that’s unrelated), whose difference from the initial value is 0.555910790149937383830547332763671875e-16; and the correct answer 0.3 is 0.444089209850062616169452667236328125e-16 away from the initial value. The absolute error is then 0.111821580299874767661094665527343750e-16, and the relative (to precision 1e-16) error is ~0.112.

Likewise, for 0.3000000000000004884981308350688777863979339599609375 and precision 1e-15, the difference from calculated 0.3000000000000001 is 0.5115018691649311222136020660400390625e-15, and from 0.3 is 0.4884981308350688777863979339599609375e-15; the absolute error is 0.0230037383298622444272041320800781250e-15; the relative (to 1e-15) error is ~0.023.

For 0.300000004999999514421205049075069837272167205810546875 and precision 1e-8, the difference from calculated 0.30000001 is 0.5000000485578794950924930162727832794189453125e-8; from proper 0.3 is 0.4999999514421205049075069837272167205810546875e-8; absolute error is 0.0000000971157589901849860325455665588378906250e-8; relative (to 1e-8) error is ~0.000000097.
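These hand computations can be verified mechanically; e.g. for the first case, a Python check with exact decimal arithmetic (the precision is raised so the subtractions stay exact):

```python
from decimal import Decimal, getcontext

getcontext().prec = 60   # plenty for the exact expansion of a double

x    = Decimal(0.1 + 0.2)             # exact value of the initial double
calc = Decimal("0.3000000000000001")  # what ROUND(...;16) produced
good = Decimal("0.3")                 # the correct answer
# "how much farther the found value is from the given value than the correct one"
err = abs(x - calc) - abs(x - good)
print(err)   # ~1.118e-17, i.e. a relative error of ~0.112 at precision 1e-16
```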

@mikekaganski:
the discussions with you are always nice, and always similar:
you try to minimize something with many words and numbers, which is a big mistake.
A rounding is a decision, and has only one quality of error, namely right or wrong, i.e. 100%. There, small errors ‘explode’.
In the not-too-distant future, self-driving cars will autonomously decide to pass a tree on the right or on the left; imagine you survive only on one side … and the decision contains a rounding error …