I am trying to create an array by horizontally concatenating data in 4 columns, like so:
import re
import numpy
col1=numpy.arange(191.25,196.275,.001)[:, numpy.newaxis]
nrows=col1.shape[0]
col2=numpy.zeros((nrows,1),dtype=int)
col3=numpy.zeros((nrows,1),dtype=int)
col4=numpy.ones((nrows,1),dtype=int)
a=numpy.hstack((col1,col2,col3,col4))
Then I convert it to a string:
a_str = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)+'\n'
And convert it back to a 2d numpy array:
a2=numpy.array(list(filter(None,re.split('[\n\t]+',a_str))),dtype=float).reshape(-1,4)
But now I get False when I compare:
a[-1,0]==a2[-1,0]
When I look at the individual values, I see:
a[-1,0]=196.27500000002399
a2[-1,0]=196.27500000000001
Is there some floating-point rounding error associated with converting from array to string and back? (a2 is actually closer to the desired value of 196.275 than a is.) How do I make the values equal? My suspicion is that I introduce the error when I initially generate col1 by iterative addition, which compounds the errors at the later array indices. Does this mean I should explicitly enumerate the values of col1 instead, or is there a workaround?
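There isn't really a fundamental solution to this. Generally speaking, most finite decimal fractions have no exact finite binary representation, so a value such as 196.275 cannot be stored exactly as a float. In your case, '%0.3f' rounds each value to three decimals, which discards the small error accumulated in a; parsing the string then yields the float closest to 196.275, which is why a2 ends up closer to the desired value. Rounding errors accrue in such conversions, so rather than testing for exact equality, use a tolerance-based comparison such as np.allclose.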
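As a rough sketch of both options (not a definitive fix), the snippet below compares the round-tripped column with a tolerance and then tries one possible workaround: enumerating integer thousandths and dividing once. The names `col1_back`, `ticks`, `col1_exact`, and `back_exact` are made up for illustration, and `atol=5e-4` is an assumed tolerance based on the three printed decimals. Note that the default `rtol` of np.allclose would allow differences larger than the 0.001 grid step at these magnitudes, so it is set to zero here.

```python
import numpy as np

# First column built as in the question (float step, so errors can creep in).
col1 = np.arange(191.25, 196.275, 0.001)

# Round-trip through 3-decimal strings, as the question does.
text = '\n'.join('%0.3f' % x for x in col1)
col1_back = np.array([float(s) for s in text.split('\n')])

# Exact equality may fail for some elements; a tolerance-based check is robust.
# atol=5e-4 is an assumed tolerance: half of the last printed decimal place.
exact_equal = np.array_equal(col1, col1_back)
close_enough = np.allclose(col1, col1_back, rtol=0.0, atol=5e-4)

# Possible workaround: enumerate integer thousandths, then divide once.
# The stop of 196276 is chosen so that 196.275 is included, matching the
# last value shown in the question.
ticks = np.arange(191250, 196276)
col1_exact = ticks / 1000.0

# The same string round-trip, applied to the integer-derived column.
text_exact = '\n'.join('%0.3f' % x for x in col1_exact)
back_exact = np.array([float(s) for s in text_exact.split('\n')])
round_trip_ok = np.array_equal(col1_exact, back_exact)

print(exact_equal, close_enough, round_trip_ok)
```

The workaround relies on each element coming from a single correctly rounded division, which gives the float nearest to the intended decimal value; that is the same float that parsing the 3-decimal string produces. It only helps when the printed precision is enough to identify each value, so np.allclose remains the safer general-purpose comparison.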