Search code examples
rsassvdsas-iml

difference between svd() and call svd, R and IML


I'm translating a package from R to IML, and this will be free online when it will be done :). I gain different results from the decomposition of a big matrix, both results seems the same when you look at them, but, if for example I take first 2 columns of U and do U'*U, my 2x2 matrix will be quite different (U_11 = 1.1e-17 and U_11 =1.4e-17). The difference is very small (3e-18) and this lead me to think that it could be something related to the number of decimals that each software use, SAS IML and R. Anyone knows something more about this topic? How can I test this? Thank you.


Solution

  • In statistics, we describe very small differences as "statistically insignificant." To a numerical analyst, differences that are less than "machine epsilon" (.Machine$double.eps in R or constant("maceps") in SAS) are almost always "numerically insignificant."

    Both SAS and R use double precision computations and are probably calling similar numerical libraries. For a difference that small, I would conjecture that the reason is not algorithmic but is because of different compiler flags and optimization flags that each software uses.

    Even within a single product, computing a result in two different orders can result in small differences like this. For example, run the following DATA step:

    data _null_;
    x = (1 + 1 + 1 + 1 + 1 + 1 + 1) / 7;
    y = (1/7 + 1/7 + 1/7 + 1/7 + 1/7 + 1/7 + 1/7);
    diff = x - y;
    put diff=;
    run;
    

    My sugggestion is to ignore "numerically insignificant" results when comparing different software. For more information about floating point computations, see The Floating Point Guide. For the real nitty gritty, see "What every computer scientist should know about floating-point arithmetic"