Round to nearest, ties to even (or round half to even)

Echo · June 18, 2021, 1:12pm

Mike Kaganski, thank you for your answer and for opening a bug report.

ve3oat, your approach is interesting, but I am required to do some further math on top of those rounded values. Inputting them as new variables in other cells it’s not something I’d like to do.

ve3oat · June 18, 2021, 2:56pm

@Echo , if you are doing “further math” on the rounded values, your final results will be only an approximation of the true answer. They will be in error, perhaps dangerously so, depending on what is involved in the “further math”.

It would be much more accurate, and safer, to do the further math on un-rounded values. You can round off the final results if you prefer, but IMHO one should never use rounded values as part of a calculation, without a full evaluation of the possible consequences. (Using rounded values in a calculation leads to “error propagation”, an important subject usually presented in an unnecessarily theoretical manner.)

Depending on the steps involved in the “further math”, you might be fortunate and not be adversely affected by all of this. On the other hand, …

Echo · June 18, 2021, 9:09pm

Thank you for your concern. I know that that is correct way to minimize the error on the end result, but unfortunately I’m forced to make such roundings because of various circumstances.

newbie-02 · June 22, 2021, 2:37pm

@ve3oat: '@Echo ‘if you are doing “further math” on the rounded values, your final results will be only an approximation of the true answer.’ that is! the standard in spreadsheets nowadays, they do! rounding to what they assume! best fits users expectations and declare that ‘IEEE Standard’.
It’s not, it’s what the developers decided, e.g. ‘ties away from zero’ and ‘approxValue’.
Calc: biggest impact in that direction, excel moderate, gnumeric least …
But, pls. understand, you won’t get ‘correct’ results by using pure IEEE 754, it suffers from decimal - binary conversion imprecision, additional small roundoff/-on failures in plenty operations and bigger deviations in problematic subtractions - ‘catastrophic cancellation’.

ve3oat · June 23, 2021, 1:18am

@newbie-02 : I don’t doubt what you say about the errors inherent in using pure IEEE 754, or any adaptation of it. That is why a rigorous study of the propagation of errors involved in any mission-critical algorithm or methodology is so important. One must always consider how much error is allowed by the error budget at each step in the calculation.

My point was that using intentionally rounded values at the beginning of a calculation can lead quickly to self-inflicted blow-ups. Round off the final result if you must, but never round the input data.

mikekaganski · June 17, 2021, 10:13pm

TL;DR: you may use this formula:

=SIGN(<NUMBER>)*((ABS(<NUMBER>)*10^<PRECISION>+2^52)-2^52)/10^<PRECISION>

where <NUMBER> is your rounded number, and <PRECISION> is number of decimals in the result. Of course, 2^52 is better replaced by its value, 4503599627370496, to avoid repeated calculations.

Explanation, credits, and disclaimer:

LibreOffice uses double precision (64-bit) IEEE 754 numbers internally.
2^52 is the smallest double that cannot have a fractional part. Adding it to a positive number results in an integer, and the value of this integer depends on current rounding mode set in the program. By default, the rounding mode mandated by IEEE 754 is round-to-even, ~all currently used processors follow this, and LibreOffice does not change this mode. It means, that when you add 2^52 to absolute value of your number, the FPU will use this mode in the process of summation, and the result will be equal to "absolute value of your number rounded to even plus 2^52". Now subtract 2^52 to obtain the rounded absolute value, apply the original sign, and you have the wanted rounded number. To have arbitrary precision, just multiply the number to appropriate power of ten before summation, and divide back after subtracting.

The method follows the code of LibreOffice’s rtl_math_round, which uses this smart technique for rtl_math_RoundingMode_HalfEven case. The method is attributed to Ross Cottrell, “the original author who placed it into public domain”.

Note that there is some possibility, that in some cases, LibreOffice might run in an environment that sets a different rounding mode. In that case, the result of the formula will be wrong. I think that actual probability of such a situation is neglectable.

Echo · June 18, 2021, 1:26pm

Thank you Mike Kaganski both for the formula (it works as intended) and for the very informative explaination.

Regarding the formula, (ABS(<NUMBER>)*10^<PRECISION>+2^52)-2^52 could be written more simply as ABS(<NUMBER>)*10^<PRECISION>+2^52-2^52 since a machine (contrary to a human being) doesn’t solve 2^52-2^52 as 0 first.

Using that formula as a base, I made another version where <PRECISION> is the significant digit to round the <NUMBER> to, rather than the decimal digit:

SIGN(<NUMBER>)*(ABS(<NUMBER>)*10^(<PRECISION>-CEILING.PRECISE(IF(<NUMBER>=0,0,LOG10(ABS(<NUMBER>))),1))+2^52-2^52)/10^(<PRECISION>-CEILING.PRECISE(IF(<NUMBER>=0,0,LOG10(ABS(<NUMBER>))),1))

mikekaganski · June 18, 2021, 1:36pm

could be written more simply …

This notation would not demonstrate the idea that clear. For the same reason, I didn’t use 4503599627370496 constant in the formula, but 2^52, which helps understand its relevance to the double-precision number with 52+1 bits of mantissa.
You seem to rely on some specific state, with unspecified order of computation, and with a possibility that the program could introduce an optimizer at some stage; so in cases where the order of computation of operations with same precedence is crucial (like here), it’s very desirable to use parentheses explicitly.

mikekaganski · June 18, 2021, 1:44pm

SIGN(<NUMBER>)*(ABS(<NUMBER>)*10^(<PRECISION>-CEILING.PRECISE(IF(<NUMBER>=0,0,LOG10(ABS(<NUMBER>))),1))+2^52-2^52)/10^(<PRECISION>-CEILING.PRECISE(IF(<NUMBER>=0,0,LOG10(ABS(<NUMBER>))),1))

Note that the crucial check for <NUMBER>=0 would be best moved outside of the computation at all, like

IF(<NUMBER>=0;0;SIGN(<NUMBER>)*((ABS(<NUMBER>)*10^(<PRECISION>-CEILING(LOG10(ABS(<NUMBER>))))+2^52)-2^52)/10^(<PRECISION>-CEILING(LOG10(ABS(<NUMBER>)))))

(and 1 is the default for the CEILING.PRECISE's - and better, ISO.CEILING (it’s part of OOXML standard) - and even better CEILING (part of ODF), second argument, anyway).

mikekaganski · June 18, 2021, 2:24pm

And of course, remember that the multiplication/division by a power of 10 can (and will) introduce errors - which is normal with all floating-point calculations: you might find some roundings where the result would not match ideal calculation

newbie-02 · June 18, 2021, 3:51pm

Thank you, above doe’s a better job in rounding of e.g. ‘=0,1+0,2’ to 16 decimals (this method → 0,3000000000000000; std.-calc → 0,3000000000000001). I had been searching for such for a while and build a complex construct which does the job, but this looks more elegant.
As std.-calc fails in the above task, but it is technically / mathematically solvable (also with fp-doubles (besides i have not yet checked if it works for all values)), can one say calc has a bug that should be fixed? or is it an undesirable but acceptable side effect of ‘approxValue’? or other measures that many roundings fail at the limit of representable precision?

Echo · June 18, 2021, 9:09pm

Mike Kaganski, thank you for your your comments, I really appreciate them. So, basically, if I have to explain to someone else the formula then I use 2^52, but for personal use it is better to replace it by its value of 4503599627370496; also, it is best to use the parentheses to force the order of computation; also, for this application, CEILING is the better function of the five CEILING functions in LibreOffice Calc; finally, regarding the errors that are introduced when multiplying and dividing by 10, probably the only real solution would be to wait that the bug report you opened gets fixed in a later version.

Echo · June 18, 2021, 9:10pm

I was experimenting a bit with the formula that uses significant digits, and I realized that if the precision is zero or negative the result is always 0, and that if it is positive but not an integer the rounding function returns a wrong result. So I modified it further:

IF(OR(<NUMBER>=0,<PRECISION><=1);0;SIGN(<NUMBER>)*((ABS(<NUMBER>)*10^(INT(<PRECISION>)-CEILING(LOG10(ABS(<NUMBER>))))+2^52)-2^52)/10^(INT(<PRECISION>)-CEILING(LOG10(ABS(<NUMBER>)))))

Can it be further optimized or improved?

newbie-02 · June 19, 2021, 5:07am

be aware that - as calc ‘as of now’ is designed with roundings and approximations - '-CEILING(LOG10(ABS(<NUMBER>)))' needs about 65 ULP’s added to e.g. 10 for the ceiling expression to switch (from -1 to -2), which might / will? make results for values from [10 … 10+1,59872115546023E-14[ inaccurate, similar applies to (most?) other decimal range borders …

mikekaganski · June 19, 2021, 7:48am

if the precision is zero or negative the result is always 0

Of course: you ask to keep less than 1 significant digit. I don’t know if you actually need the change of the formula because of this: you (a) make the formula more complex for understanding, and (b) make it perform two more operations per each call (comparing <PRECISION> against 1, and executing OR) - for something that already works out of the box, and which should be rather corner case - so you optimize a case that should be rare, at the expense of pessimizing the normal (and most often called) case. Its net effect should be overall slower operation in a spreadsheet making heavy use of the formula.

Echo · June 19, 2021, 7:52am

I made a mistake in the formula I posted earlier: <PRECISION><=1 should be <PRECISION><1.

newbie-02, thank you for pointing that out. Would -CEILING(ROUNDSIG(LOG10(ABS(<NUMBER>))),14) solve the problem for very small and very large numbers too, or would it cause more rounding errors?

mikekaganski · June 19, 2021, 7:53am

if it is positive but not an integer the rounding function returns a wrong result

It is not wrong - you just was asking for rounding to fractional number of decimal digits, which it performs correctly, but the result is not what you actually want. Using INT is one option; Basic tends to using rounding to nearest for cases where it expects integers and floats are passed - then using ROUND could be another option (and of course, ROUND(<PRECISION>) would not use “half-to-even” ;)).

mikekaganski · June 19, 2021, 7:55am

regarding the errors that are introduced when multiplying and dividing by 10, probably the only real solution would be to wait that the bug report you opened gets fixed in a later version

No, these errors will not be fixed completely. You can’t avoid floating-point operations introducing errors. Sometimes they may be made a bit smaller, but not removed completely.

These errors are what the spammer @newbie-02 is obsessed about. They spam every topic dealing with floating-point calculations, no matter if it is even related. Like here, where they didn’t start a new topic, but just started to pollute a clean discussion with unrelated stuff.

Echo · June 19, 2021, 1:06pm

Mike Kaganski, again, thank you for you very informative comments. I’m learning a lot from them.

of course, ROUND(<PRECISION>) would not use “half-to-even”

[Sarcasm]I can substitute <PRECISION> with

SIGN(<PRECISION>)*((ABS(<PRECISION>)+2^52)-2^52)

in the relevant formula[/Sarcasm], however ROUNDSIG doesn’t round the second argument to the nearest integer, but it directly rounds it down (e.g. ROUNDSIG(1.11;1.99) returns 1 instead of 1.1). If <PRECISION> is manually inserted, then I think INT is better (e.g. -1.01 decimal digits are less precise than -1 decimal digits, so it should round to -2 decimal digits), however if <PRECISION> is the result of another formula and the user expects it to be an integer, then I think ROUND would be the better option.

newbie-02 · June 27, 2021, 8:51pm

after a short joy that the Ross Cottrell algorithm works for some values also for rounding to 16 digits (absolute) I came across limits, e.g. the ‘4925’ which should be cut off from ‘0,1656105402301851 | 4925’ for rounding to 16 digits are revalued to ‘~,5’ when multiplying with 1E+16, which of course has a different and undesired effect than ‘~,4~’ when rounding.
So this procedure is not usable for correct roundings ‘in the border area’.
Rounding there makes sense though, besides ‘~851,0’ - ‘~851,2’, ‘~851,5’ and ‘~851,8’ are still binary representable, can occur both meaningful and as a result of fp errors, and it would be good to be able to round them away.
— continued —