Multiplication of floating point numbers gives different results in Numpy and R

I am doing data analysis in Python (Numpy) and R. My data is a 795067 x 3 array, and computing the mean, median, standard deviation, and IQR on it yields different results depending on whether I use Numpy or R. I cross-checked the values and it looks like R gives the "correct" value.

Median: 
Numpy: 14.948499999999999
R: 14.9632

Mean: 
Numpy: 13.097945407088607
R: 13.10936

Standard Deviation: 
Numpy: 7.3927612774052083
R: 7.390328

IQR: 
Numpy: 12.358700000000002
R: 12.3468

Max and min of the data are the same on both platforms. I ran a quick test to better understand what is going on here.

Multiplying 1.2*1.2 in Numpy gives 1.44 (same with R).
Multiplying 1.22*1.22 gives 1.4884 in Numpy and the same with R.
However, multiplying 1.222*1.222 in Numpy gives 1.4932839999999998 which is clearly wrong! Doing the multiplication in R gives the correct answer of 1.493284.
Multiplying 1.2222*1.2222 in Numpy gives 1.4937728399999999 and 1.493773 in R. Once more, R is correct.

In Numpy, the numbers are float64 datatype and they are double in R. What is going on here? Why are Numpy and R giving different results? I know R uses IEEE754 double-precision but I don't know what precision Numpy uses. How can I change Numpy to give me the "correct" answer?

Solution

Python

The print statement/function in Python 2 does not show the full stored value: it prints a rounded representation (about 12 significant digits for a double). The calculations themselves are done in the precision of the operands. Python floats and numpy's default float type are double-precision (at least on my 64-bit machine):

import numpy

single = numpy.float32(1.222) * numpy.float32(1.222)
double = numpy.float64(1.222) * numpy.float64(1.222)
pyfloat = 1.222 * 1.222

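# Python 2 print statement: shows a rounded value
print single, double, pyfloat
# 1.49328 1.493284 1.493284

# Ask for 16 decimal places to see what is actually stored
print "%.16f, %.16f, %.16f"%(single, double, pyfloat)
# 1.4932839870452881, 1.4932839999999998, 1.4932839999999998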

In an interactive Python/iPython shell, the shell echoes the repr() of a result, which shows enough digits to identify the stored double exactly:

>>> 1.222 * 1.222
1.4932839999999998

In [1]: 1.222 * 1.222
Out[1]: 1.4932839999999998
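
To confirm that only the displayed digits differ and not the stored value, the same product can be formatted in more than one way. This is a minimal sketch in Python 3 syntax; the variable names are made up for illustration, and "%.12g" is used as a stand-in for the rounding that Python 2's print applies to a double.

```python
x = 1.222 * 1.222

rounded = "%.12g" % x   # 12 significant digits, like Python 2's print for a double
full = repr(x)          # enough digits to recover the double exactly

print(rounded)            # 1.493284
print(full)               # 1.4932839999999998
print(float(full) == x)   # True: repr round-trips to the same double
print(x == 1.493284)      # False: the stored product is a different double
```

The short form is a display choice only. Both R and Numpy hold the same double; neither holds exactly 1.493284.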

R

It looks like R is doing the same as Python when using print and sprintf:

print(1.222 * 1.222)
# 1.493284

sprintf("%.16f", 1.222 * 1.222)
# "1.4932839999999998"

Unlike the interactive Python shells, the interactive R shell also rounds echoed results for display (7 significant digits by default):

> 1.222 * 1.222
[1] 1.493284
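
R's default of 7 significant digits (its `digits` option) can be imitated in Python to line the two outputs up. The following is a rough sketch, assuming `%g`-style formatting is a close enough approximation of what R's print does; `r_style` is a hypothetical helper, not part of any library.

```python
def r_style(x, digits=7):
    """Format x with at most `digits` significant digits (approximation of R's print)."""
    return format(x, ".{}g".format(digits))

print(r_style(1.222 * 1.222))       # 1.493284
print(r_style(1.2222 * 1.2222))     # 1.493773
print(r_style(1.222 * 1.222, 17))   # 1.4932839999999998
```

With 7 digits the Python values match what R printed in the question; asking for 17 digits brings back the long form that Numpy showed.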

Differences between Python and R

The differences in your results could stem from using single-precision values in numpy. In a long chain of additions and subtractions, every operation rounds again, and the accumulated error eventually shows up in the result:

In [1]: import numpy

In [2]: a = numpy.float32(1.222)

In [3]: a*6
Out[3]: 7.3320000171661377

In [4]: a+a+a+a+a+a
Out[4]: 7.3320003
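
The same effect can be reproduced without numpy by rounding to single precision by hand after every addition. This is a sketch under the assumption that packing a value with the standard library's struct format "f" is an acceptable stand-in for a float32; `to_single` is a hypothetical helper name.

```python
import struct

def to_single(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

a = to_single(1.222)

# Multiply once in double precision: exact for this value.
product = a * 6

# Add six times, rounding to single precision after each step.
total = 0.0
for _ in range(6):
    total = to_single(total + a)

print("%.16f" % product)
print("%.16f" % total)
print(total - product)   # small but non-zero accumulated rounding error
```

The gap between the two results comes only from the repeated rounding to single precision, which is why long sums are where the problem becomes visible.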

As suggested in the comments on your question, make sure your numpy calculations use double-precision floats (dtype float64).
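
One way to check for and avoid single-precision arithmetic in numpy is sketched below. The array is made-up stand-in data (the real data set is not shown in the question), and the variable names are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for data that was loaded as single precision.
data32 = np.full(100000, 1.222, dtype=np.float32)

print(data32.dtype)   # float32: this is what to look for first

# Option 1: widen the array once, then compute statistics as usual.
data64 = data32.astype(np.float64)
mean64 = data64.mean()

# Option 2: keep float32 storage but accumulate in double precision.
mean_acc = np.mean(data32, dtype=np.float64)

print(data64.dtype, mean64, mean_acc)
```

If `dtype` already reports float64 for the real data, precision is not the cause and the difference has to come from somewhere else, such as how the data was read in.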