Rounding error from array -> string -> array conversion


I am trying to create an array by horizontally concatenating data in 4 columns, like so:

import numpy

col1=numpy.arange(191.25,196.275,.001)[:, numpy.newaxis]
nrows=col1.shape[0]

# numpy.int was removed in NumPy 1.24; the built-in int is equivalent
col2=numpy.zeros((nrows,1),dtype=int)
col3=numpy.zeros((nrows,1),dtype=int)
col4=numpy.ones((nrows,1),dtype=int)

a=numpy.hstack((col1,col2,col3,col4))

Then I convert it to a string:

a_str = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)+'\n'

And convert it back to a 2d numpy array:

import re  # list() is needed on Python 3, where filter() returns an iterator
a2=numpy.array(list(filter(None,re.split('[\n\t]+',a_str))),dtype=float).reshape(-1,4)

But now I get False when I compare:

a[-1,0]==a2[-1,0]

When I look at the individual values, I see:

a[-1,0]=196.27500000002399
a2[-1,0]=196.27500000000001

Is there some floating-point/rounding error associated with converting from array to string and back (a2 is actually closer to the desired value of 196.275 than a)? How do I make the values equal? My suspicion is that I introduce the error when I initially generate col1 by iterative addition, which compounds the errors in the later array indices. Does this mean I should explicitly enumerate the values of col1 instead, or is there a workaround?


Solution

  • There isn't really a fundamental solution to this. Generally speaking, most finite decimal fractions have no exact finite binary representation, and vice versa for a decimal string truncated to a few digits. Rounding errors accrue in such conversions (here, '%0.3f' also discards everything past the third decimal place), so rather than testing for exact equality, use a tolerance-based comparison such as np.allclose.
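
As a rough illustration of the tolerance-based comparison, here is a minimal sketch that rebuilds only the first column from the question and round-trips it through text. The tolerance atol=1e-6 is an assumption (chosen to be far below the 0.001 step), and the '%.17g' variant is an addition that is not part of the original answer: 17 significant digits are enough to round-trip a float64 exactly, should exact equality really be required.

```python
import numpy as np

# Rebuild the first column roughly as in the question.
col1 = np.arange(191.25, 196.275, 0.001)

# Round trip through a 3-decimal string, as '%0.3f' does in the question.
text = "\n".join("%0.3f" % x for x in col1)
col1_back = np.array(text.split(), dtype=float)

# Count exact matches; this is not guaranteed to equal len(col1).
exact_matches = int((col1 == col1_back).sum())

# Tolerance-based comparison. atol=1e-6 is an assumed tolerance,
# picked to sit well below the 0.001 spacing of the data.
close = np.allclose(col1, col1_back, rtol=0.0, atol=1e-6)

# Alternative (assumption, not from the answer): write 17 significant
# digits so that every float64 value survives the round trip exactly.
text17 = "\n".join("%.17g" % x for x in col1)
col1_exact = np.array(text17.split(), dtype=float)
lossless = np.array_equal(col1, col1_exact)

print(exact_matches, len(col1), close, lossless)
```

If the file only needs to be human-readable to three decimals, the np.allclose check is probably the simpler choice; the '%.17g' format trades readability for an exact round trip.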