How to keep 'nan' from becoming '0' when reading data with numpy.genfromtxt in Python


I am trying to read an array from a file named "filin1", which looks like this:

filin1 = [1,3,4, ....,nan,nan,nan..] (in the file it is actually a single column, not an array like this)

I am using numpy.genfromtxt as follows:

 np.genfromtxt(filin1,dtype=None,delimiter=',',usecols=[0],missing_values='Missing',usemask=False,filling_values=np.nan)

I expected to get [1,3,4, ....,nan,nan,nan..], but the result was:

[1,3,4, ....,0.,0.,0...]

I would like to keep 'nan' as it is, without it being converted to '0.'.

Could you please give me any ideas or advice?

Thank you, Isaac


Solution

  • If I simulate your case with a string input, I have no problem reading the nan:

    In [73]: txt=b'''1,2
    3,4
    1.23,nan
    nan,02
    '''
    In [74]: txt=txt.splitlines()
    In [75]: txt
    Out[75]: [b'1,2', b'3,4', b'1.23,nan', b'nan,02']
    In [76]: np.genfromtxt(txt,delimiter=',')
    Out[76]: 
    array([[ 1.  ,  2.  ],
           [ 3.  ,  4.  ],
           [ 1.23,   nan],
           [  nan,  2.  ]])
    

    nan is a valid float value:

    In [80]: float('nan')
    Out[80]: nan
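
To make the point about nan being a float a bit more concrete, here is a minimal sketch. The variable names are only illustrative and not from the question. It shows that nan is an ordinary float, that it never compares equal to itself, and that `math.isnan` or `np.isnan` is therefore the way to look for it in the array you read.

```python
import math
import numpy as np

value = float("nan")

# nan is an ordinary Python float.
is_float = isinstance(value, float)

# nan never compares equal to anything, including itself,
# so `arr == np.nan` cannot be used to find it.
equals_itself = (value == value)

# Use the dedicated predicates instead.
detected = math.isnan(value)

arr = np.array([1.0, 3.0, value])
nan_mask = np.isnan(arr)           # element-wise test
n_missing = int(nan_mask.sum())    # how many entries are nan
```

Running `np.isnan` on the result of `genfromtxt` is a quick way to confirm whether the nan entries survived the read.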
    

    Your command also works on this input; because of usecols=[0] it returns only the first column:

    In [82]: np.genfromtxt(txt,dtype=None,delimiter=',',usecols=[0],missing_values='Missing',usemask=False,filling_values=np.nan)
    Out[82]: array([ 1.  ,  3.  ,  1.23,   nan])
    

    Expecting the column to contain integers rather than floats could cause problems, since nan is a float, not an int.
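
As a rough illustration of the float-versus-int point, the sketch below asks for `dtype=float` explicitly instead of `dtype=None`, so the column type is not inferred. The sample lines are the ones from the session above; the other names are assumptions made for the example. It also shows that nan has no integer representation.

```python
import numpy as np

# Same sample data as in the session above, as a list of lines.
lines = ["1,2", "3,4", "1.23,nan", "nan,02"]

# Explicit float dtype: every field, including 'nan', is parsed as a float.
first_col = np.genfromtxt(lines, dtype=float, delimiter=",", usecols=[0])
nan_mask = np.isnan(first_col)

# nan has no integer representation, so plain Python refuses the conversion.
try:
    int(float("nan"))
    int_conversion_failed = False
except ValueError:
    int_conversion_failed = True

# Mixing nan into otherwise integer data promotes the array to float.
promoted = np.array([1, 3, np.nan])
```

If the real file gives zeros where nan was expected, checking the `dtype` of the returned array is a reasonable first step, since an integer result cannot contain nan.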

    Missing values also result in nan with both calls:

    In [91]: txt
    Out[91]: [b'1,2', b'3,', b'1.23,nan', b'nan,02', b',']