Tags: r, linux, windows, calculation, exp

Different results of log and exponential in Linux and Windows in R, why and any workaround?


Here is the situation: I have a vector 0:4 that I transform with log1p and then transform back using exp(x) - 1. I categorized the original vector and the back-transformed vector using the same breaks, but the results differ between Windows and Linux. Here is a minimal example:

vec_original <- 0:4
# Transform with log1p(), then invert with exp(x) - 1
log_vec <- log1p(vec_original)
vec_bktrans <- exp(log_vec) - 1
# Bin both vectors with the same breaks
cat_original <- cut(vec_original, c(0, 1, 2, 4, Inf), include.lowest = TRUE)
cat_bktrans <- cut(vec_bktrans, c(0, 1, 2, 4, Inf), include.lowest = TRUE)

# Print 20 significant digits to expose the floating-point differences
data.frame(
  original = format(vec_original, digits = 20),
  bk_trans = format(vec_bktrans, digits = 20),
  cat_original = cat_original,
  cat_bktrans = cat_bktrans
)

Running this code on Linux, I get the following output (see row 3):

  original              bk_trans cat_original cat_bktrans
1        0 0.0000000000000000000        [0,1]       [0,1]
2        1 1.0000000000000000000        [0,1]       [0,1]
3        2 1.9999999999999995559        (1,2]       (1,2]
4        3 3.0000000000000000000        (2,4]       (2,4]
5        4 3.9999999999999991118        (2,4]       (2,4]

On Windows, I get the following results (see row 3):

  original              bk_trans cat_original cat_bktrans
1        0 0.0000000000000000000        [0,1]       [0,1]
2        1 1.0000000000000000000        [0,1]       [0,1]
3        2 2.0000000000000004441        (1,2]       (2,4]
4        3 3.0000000000000000000        (2,4]       (2,4]
5        4 3.9999999999999991118        (2,4]       (2,4]

Can anyone please explain what is the cause of this?


Solution

  • This is speculative, but my guess is that the difference between the platforms (which may depend on compiler options and the actual CPU architecture as much as on the OS) comes from the optional use of 80-bit extended-precision registers somewhere in the computational pathway; e.g. see this question or this question. There is far more information on extended precision here:

    The foregoing remarks are not intended to disparage extended-based systems but to expose several fallacies, the first being that all IEEE 754 systems must deliver identical results for the same program.
    ... As a result, despite nearly universal conformance to (most of) the IEEE 754 standard throughout the computer industry, programmers of portable software must continue to cope with unpredictable floating-point arithmetic.

    The higher-level answer to this question is (unfortunately) "never rely on exact equality in floating-point calculations across platforms". As for "any workaround": round the results, as suggested by @thelatemail, or use something like all.equal() to check approximate equality. The exact answer depends on what use you want to make of these results downstream, and why it matters that they be exactly equivalent.
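
    As a minimal sketch of the two workarounds mentioned above, applied to the example from the question: the choice of digits = 8 is an assumption made for illustration (much coarser than the ~1e-16 discrepancies, much finer than the spacing of the breaks), and the variable names vec_rounded, cat_rounded and breaks are hypothetical. Adjust the rounding to whatever precision is meaningful for your data.

```r
# Back-transform as in the question; the last few bits may differ by platform.
vec_original <- 0:4
vec_bktrans <- exp(log1p(vec_original)) - 1
breaks <- c(0, 1, 2, 4, Inf)

# Workaround 1: round away the floating-point noise before binning.
# digits = 8 is an arbitrary choice for this sketch.
vec_rounded <- round(vec_bktrans, digits = 8)
cat_original <- cut(vec_original, breaks, include.lowest = TRUE)
cat_rounded <- cut(vec_rounded, breaks, include.lowest = TRUE)

# Check that the rounded values fall into the same bins as the originals.
same_bins <- identical(cat_original, cat_rounded)

# Workaround 2: compare with a tolerance instead of testing exact equality.
# all.equal() returns TRUE or a description of the difference, so wrap it
# in isTRUE() before using it in a condition.
approx_equal <- isTRUE(all.equal(vec_bktrans, as.numeric(vec_original)))
```

    Rounding is the relevant fix when the values feed into cut(), because a value just above or below a break lands in a different bin. all.equal() only helps when the goal is to compare two vectors.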