Round to nearest, ties to even (or round half to even)

LibreOffice Calc’s ROUND function uses the round to nearest, ties away from zero (or round half away from zero) rounding method. This is not the rounding method recommended by IEEE 754-2019 and ISO/IEC 60559:2020, so my first question is: is there a function in Calc that uses this rounding method?

I wasn’t able to find a positive answer for the above, but I found this and this other threads. Unfortunately all the suggested formulae are either wrong or work only for integers or cents, so I made a Calc document with sources and a comparison between rounding methods (1st sheet) and hopefully correct methods to round numbers with the round half to even rounding method mentioned above, to a specified decimal digit (2nd sheet) or significant digit (3rd sheet). I tried to make it with as simple steps as possible, while commenting them as best as I could. I have some questions regarding this document:

  • are there combinations of numbers and digits to round to that return an incorrect value?
  • is it possible to simplify the number of steps and/or change them to get rid of machine errors?
  • is it possible to make user-defined functions that will work on every spreadsheet instead of having to use a few cells to store variables for every number I want to round?

Thank you in advance.

is it possible to make user-defined functions that will work on every spreadsheet instead of having to use a few cells to store variables for every number I want to round?

No (at least in Basic, at least yet) - although LibreOffice Basic has a Round VBA compatibility function, and its VBA documentation requires it to implement banker’s rounding, LibreOffice implementation does not implement that.

tdf#142922

An interesting question, and food for thought.

For what it’s worth, none of my applications use rounding. Results are formatted to display one or two more significant digits than are required in practice, then if necessary I just “round” them in my head.

Mike Kaganski, thank you for your answer and for opening a bug report.

ve3oat, your approach is interesting, but I am required to do some further math on top of those rounded values. Inputting them as new variables in other cells it’s not something I’d like to do.

@Echo , if you are doing “further math” on the rounded values, your final results will be only an approximation of the true answer. They will be in error, perhaps dangerously so, depending on what is involved in the “further math”.

It would be much more accurate, and safer, to do the further math on un-rounded values. You can round off the final results if you prefer, but IMHO one should never use rounded values as part of a calculation, without a full evaluation of the possible consequences. (Using rounded values in a calculation leads to “error propagation”, an important subject usually presented in an unnecessarily theoretical manner.)

Depending on the steps involved in the “further math”, you might be fortunate and not be adversely affected by all of this. On the other hand, …

Thank you for your concern. I know that that is correct way to minimize the error on the end result, but unfortunately I’m forced to make such roundings because of various circumstances.

@ve3oat: '@Echo ‘if you are doing “further math” on the rounded values, your final results will be only an approximation of the true answer.’ that is! the standard in spreadsheets nowadays, they do! rounding to what they assume! best fits users expectations and declare that ‘IEEE Standard’.
It’s not, it’s what the developers decided, e.g. ‘ties away from zero’ and ‘approxValue’.
Calc: biggest impact in that direction, excel moderate, gnumeric least …
But, pls. understand, you won’t get ‘correct’ results by using pure IEEE 754, it suffers from decimal - binary conversion imprecision, additional small roundoff/-on failures in plenty operations and bigger deviations in problematic subtractions - ‘catastrophic cancellation’.

@newbie-02 : I don’t doubt what you say about the errors inherent in using pure IEEE 754, or any adaptation of it. That is why a rigorous study of the propagation of errors involved in any mission-critical algorithm or methodology is so important. One must always consider how much error is allowed by the error budget at each step in the calculation.

My point was that using intentionally rounded values at the beginning of a calculation can lead quickly to self-inflicted blow-ups. Round off the final result if you must, but never round the input data.

TL;DR: you may use this formula:

=SIGN(<NUMBER>)*((ABS(<NUMBER>)*10^<PRECISION>+2^52)-2^52)/10^<PRECISION>

where <NUMBER> is your rounded number, and <PRECISION> is number of decimals in the result. Of course, 2^52 is better replaced by its value, 4503599627370496, to avoid repeated calculations.

Explanation, credits, and disclaimer:

LibreOffice uses double precision (64-bit) IEEE 754 numbers internally.
2^52 is the smallest double that cannot have a fractional part. Adding it to a positive number results in an integer, and the value of this integer depends on current rounding mode set in the program. By default, the rounding mode mandated by IEEE 754 is round-to-even, ~all currently used processors follow this, and LibreOffice does not change this mode. It means, that when you add 2^52 to absolute value of your number, the FPU will use this mode in the process of summation, and the result will be equal to "absolute value of your number rounded to even plus 2^52". Now subtract 2^52 to obtain the rounded absolute value, apply the original sign, and you have the wanted rounded number. To have arbitrary precision, just multiply the number to appropriate power of ten before summation, and divide back after subtracting.

The method follows the code of LibreOffice’s rtl_math_round, which uses this smart technique for rtl_math_RoundingMode_HalfEven case. The method is attributed to Ross Cottrell, “the original author who placed it into public domain”.

Note that there is some possibility, that in some cases, LibreOffice might run in an environment that sets a different rounding mode. In that case, the result of the formula will be wrong. I think that actual probability of such a situation is neglectable.

Thank you Mike Kaganski both for the formula (it works as intended) and for the very informative explaination.

Regarding the formula, (ABS(<NUMBER>)*10^<PRECISION>+2^52)-2^52 could be written more simply as ABS(<NUMBER>)*10^<PRECISION>+2^52-2^52 since a machine (contrary to a human being) doesn’t solve 2^52-2^52 as 0 first.

Using that formula as a base, I made another version where <PRECISION> is the significant digit to round the <NUMBER> to, rather than the decimal digit:

SIGN(<NUMBER>)*(ABS(<NUMBER>)*10^(<PRECISION>-CEILING.PRECISE(IF(<NUMBER>=0,0,LOG10(ABS(<NUMBER>))),1))+2^52-2^52)/10^(<PRECISION>-CEILING.PRECISE(IF(<NUMBER>=0,0,LOG10(ABS(<NUMBER>))),1))

could be written more simply …

  1. This notation would not demonstrate the idea that clear. For the same reason, I didn’t use 4503599627370496 constant in the formula, but 2^52, which helps understand its relevance to the double-precision number with 52+1 bits of mantissa.
  2. You seem to rely on some specific state, with unspecified order of computation, and with a possibility that the program could introduce an optimizer at some stage; so in cases where the order of computation of operations with same precedence is crucial (like here), it’s very desirable to use parentheses explicitly.

SIGN(<NUMBER>)*(ABS(<NUMBER>)*10^(<PRECISION>-CEILING.PRECISE(IF(<NUMBER>=0,0,LOG10(ABS(<NUMBER>))),1))+2^52-2^52)/10^(<PRECISION>-CEILING.PRECISE(IF(<NUMBER>=0,0,LOG10(ABS(<NUMBER>))),1))

Note that the crucial check for <NUMBER>=0 would be best moved outside of the computation at all, like

IF(<NUMBER>=0;0;SIGN(<NUMBER>)*((ABS(<NUMBER>)*10^(<PRECISION>-CEILING(LOG10(ABS(<NUMBER>))))+2^52)-2^52)/10^(<PRECISION>-CEILING(LOG10(ABS(<NUMBER>)))))

(and 1 is the default for the CEILING.PRECISE's - and better, ISO.CEILING (it’s part of OOXML standard) - and even better CEILING (part of ODF), second argument, anyway).

And of course, remember that the multiplication/division by a power of 10 can (and will) introduce errors - which is normal with all floating-point calculations: you might find some roundings where the result would not match ideal calculation :slight_smile:

Thank you, above doe’s a better job in rounding of e.g. ‘=0,1+0,2’ to 16 decimals (this method → 0,3000000000000000; std.-calc → 0,3000000000000001). I had been searching for such for a while and build a complex construct which does the job, but this looks more elegant.
As std.-calc fails in the above task, but it is technically / mathematically solvable (also with fp-doubles (besides i have not yet checked if it works for all values)), can one say calc has a bug that should be fixed? or is it an undesirable but acceptable side effect of ‘approxValue’? or other measures that many roundings fail at the limit of representable precision?

Mike Kaganski, thank you for your your comments, I really appreciate them. So, basically, if I have to explain to someone else the formula then I use 2^52, but for personal use it is better to replace it by its value of 4503599627370496; also, it is best to use the parentheses to force the order of computation; also, for this application, CEILING is the better function of the five CEILING functions in LibreOffice Calc; finally, regarding the errors that are introduced when multiplying and dividing by 10, probably the only real solution would be to wait that the bug report you opened gets fixed in a later version.

I was experimenting a bit with the formula that uses significant digits, and I realized that if the precision is zero or negative the result is always 0, and that if it is positive but not an integer the rounding function returns a wrong result. So I modified it further:

IF(OR(<NUMBER>=0,<PRECISION><=1);0;SIGN(<NUMBER>)*((ABS(<NUMBER>)*10^(INT(<PRECISION>)-CEILING(LOG10(ABS(<NUMBER>))))+2^52)-2^52)/10^(INT(<PRECISION>)-CEILING(LOG10(ABS(<NUMBER>))))) 

Can it be further optimized or improved?

be aware that - as calc ‘as of now’ is designed with roundings and approximations - '-CEILING(LOG10(ABS(<NUMBER>)))' needs about 65 ULP’s added to e.g. 10 for the ceiling expression to switch (from -1 to -2), which might / will? make results for values from [10 … 10+1,59872115546023E-14[ inaccurate, similar applies to (most?) other decimal range borders …

if the precision is zero or negative the result is always 0

Of course: you ask to keep less than 1 significant digit. I don’t know if you actually need the change of the formula because of this: you (a) make the formula more complex for understanding, and (b) make it perform two more operations per each call (comparing <PRECISION> against 1, and executing OR) - for something that already works out of the box, and which should be rather corner case - so you optimize a case that should be rare, at the expense of pessimizing the normal (and most often called) case. Its net effect should be overall slower operation in a spreadsheet making heavy use of the formula.

I made a mistake in the formula I posted earlier: <PRECISION><=1 should be <PRECISION><1.

newbie-02, thank you for pointing that out. Would -CEILING(ROUNDSIG(LOG10(ABS(<NUMBER>))),14) solve the problem for very small and very large numbers too, or would it cause more rounding errors?

if it is positive but not an integer the rounding function returns a wrong result

It is not wrong - you just was asking for rounding to fractional number of decimal digits, which it performs correctly, but the result is not what you actually want. Using INT is one option; Basic tends to using rounding to nearest for cases where it expects integers and floats are passed - then using ROUND could be another option (and of course, ROUND(<PRECISION>) would not use “half-to-even” ;)).