Tags: python, pandas, numpy, export-to-csv, numpy-ndarray

How do I convert a 'numpy.ndarray' object and write it to a CSV file?


When you call DataFrame.to_numpy(), pandas will find the NumPy dtype that can hold all of the dtypes in the DataFrame. But how do I perform the reverse operation?

I have a 'numpy.ndarray' object 'pred'. It looks like this:

[[0.00599913 0.00506044 0.00508315 ... 0.00540191 0.00542058 0.00542058]]

I am trying to convert it like this:

 pred = np.uint8(pred)
 print("Model predict:\n", pred.T)

But I get:

[[0 0 0 ... 0 0 0]]

Why, after the conversion, do I not get something like this:

0 0 0 0 0 0 ... 0 0 0 0 0 0

And how do I write pred to a file?

pred.to_csv('pred.csv', header=None, index=False)
pred = pd.read_csv('pred.csv', sep=',', header=None)

Gives an error message:

AttributeError                            Traceback (most recent call last)
<ipython-input-68-b223b39b5db1> in <module>()
----> 1 pred.to_csv('pred.csv', header=None, index=False)
      2 pred = pd.read_csv('pred.csv', sep=',', header=None)
AttributeError: 'numpy.ndarray' object has no attribute 'to_csv'

Please help me figure this out.


Solution

  • pred is an ndarray. It does not have a to_csv method. That's something a pandas DataFrame has.
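As a minimal sketch of the file-writing part, either of the two routes below should work. The array is a shortened stand-in for your pred (the elided values are not filled in), and pred_df.csv is just an illustrative file name.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's `pred`: a 2-D float array of shape (1, 6).
pred = np.array([[0.00599913, 0.00506044, 0.00508315,
                  0.00540191, 0.00542058, 0.00542058]])

# Option 1: NumPy only. savetxt writes one text row per array row.
np.savetxt("pred.csv", pred, delimiter=",", fmt="%.8f")

# Option 2: wrap the array in a DataFrame, which does have to_csv.
# "pred_df.csv" is an illustrative name, not from the question.
pd.DataFrame(pred).to_csv("pred_df.csv", header=False, index=False)

# Reading the file back gives a DataFrame; to_numpy() returns an ndarray again.
restored = pd.read_csv("pred.csv", sep=",", header=None).to_numpy()
print(restored.shape)
```

Wrapping the array in pd.DataFrame is also the "reverse operation" of DataFrame.to_numpy() that the question asks about.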

    But let's look at the first part of your question.

    Copying your array display and adding commas lets me make a list:

    In [1]: alist = [[0.00599913, 0.00506044, 0.00508315, 0.00540191, 0.00542058, 0.00542058]]
    In [2]: alist                                                                   
    Out[2]: [[0.00599913, 0.00506044, 0.00508315, 0.00540191, 0.00542058, 0.00542058]]
    

    and make an array from that:

    In [3]: arr = np.array(alist) 
    In [8]: print(arr)                                                              
    [[0.00599913 0.00506044 0.00508315 0.00540191 0.00542058 0.00542058]]
    

    or the repr display that ipython gives as the default:

    In [4]: arr                                                                     
    Out[4]: 
    array([[0.00599913, 0.00506044, 0.00508315, 0.00540191, 0.00542058,
            0.00542058]])
    

    Because of the double brackets, this is a 2d array. Its transpose will have shape (6,1).

    In [5]: arr.shape                                                               
    Out[5]: (1, 6)
    

    Conversion to uint8 works as expected (I prefer the astype version):

    In [6]: np.uint8(arr)                                                           
    Out[6]: array([[0, 0, 0, 0, 0, 0]], dtype=uint8)
    In [7]: arr.astype('uint8')                                                     
    Out[7]: array([[0, 0, 0, 0, 0, 0]], dtype=uint8)
    

    The converted shape is as before (1,6).

    The conversion is nearly meaningless. The values are all small, between 0 and 1. Casting floats to 1-byte unsigned integers drops the fractional part, so every value predictably becomes 0.
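To make the truncation concrete, here is a small sketch using the same six-value stand-in array. The rescaling step is only a hypothetical alternative. Whether multiplying by 255 makes sense depends on what the predictions represent, which the question does not say.

```python
import numpy as np

arr = np.array([[0.00599913, 0.00506044, 0.00508315,
                 0.00540191, 0.00542058, 0.00542058]])

# Casting float -> uint8 drops the fractional part, so values in [0, 1) become 0.
truncated = arr.astype(np.uint8)
print(truncated)

# Hypothetical alternative (an assumption, not from the question):
# rescale to the 0..255 range and round before casting.
scaled = np.round(arr * 255).astype(np.uint8)
print(scaled)

# ravel() flattens the (1, 6) array to shape (6,), which prints
# with a single pair of brackets instead of two.
print(truncated.ravel())
```

The double brackets in your output come from the array being 2-D, not from the dtype conversion, so flattening is what changes how it prints.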