Tags: python, pandas, csv, fillna

pd.fillna replacement value not appearing in correct position in csv file


I am very new to Python and am trying to fill the NaN values in a very large CSV file with a space using DataFrame.fillna, but the spaces all end up at the end of each row that has NaN values instead of in the positions where the values are missing. My code is listed below.

import csv
import pandas as pd

reader = pd.read_csv(path, sep=r'\s+', skipinitialspace=True, dtype=str, engine='python', names=headers)
spacedReader = reader.fillna(value=' ', axis=1)
spacedReader.to_csv(path, sep=',', quoting=csv.QUOTE_NONE, escapechar=' ', header=headers)

An input like this...

001  alb  272  nc
002  NaN  NaN  nc
NaN  alb  768  nc

will produce...

001,alb,272,nc
002,nc, , 
alb,768,nc, 

And I need...

001,alb,272,nc
002, , ,nc
 ,alb,768,nc

I would greatly appreciate any help!


Solution

  • You can try something like this, which replaces every NaN with an empty string (use ' ' instead if you want a space):

    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                       [3, 4, np.nan, 1],
                       [np.nan, np.nan, np.nan, 5],
                       [np.nan, 3, np.nan, 4]],
                      columns=list('ABCD'))
    
    # Replace every NaN with an empty string; regex matching is not needed here
    df1 = df.replace(np.nan, '')
    
    df1.head()
    

    Output:

        A   B   C   D
    0       2       0
    1   3   4       1
    2               5
    3       3       4
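
If the real file has blank fields where the question shows NaN (an assumption, since the file itself is not shown), the shift probably happens at read time: splitting on runs of whitespace with sep='\s+' merges a blank field into the neighbouring gaps, so the remaining values slide left and the NaNs end up at the end of the row. A minimal sketch of one way around that, assuming the columns are fixed-width, is to read by column position with pd.read_fwf and only then fill and write. The column names, the colspecs, and the in-memory sample below are hypothetical stand-ins for the real file.

```python
import io

import pandas as pd

# Hypothetical sample: the rows from the question, with missing fields left
# blank and each of the first three fields padded to 3 characters.
rows = [("001", "alb", "272", "nc"),
        ("002", "", "", "nc"),
        ("", "alb", "768", "nc")]
raw = "".join("{:<3}  {:<3}  {:<3}  {}\n".format(*r) for r in rows)

headers = ["id", "name", "value", "code"]  # hypothetical column names

# read_fwf slices each line by character position (half-open intervals),
# so a blank field stays in its own column and becomes NaN.
df = pd.read_fwf(
    io.StringIO(raw),
    colspecs=[(0, 3), (5, 8), (10, 13), (15, 17)],
    header=None,
    names=headers,
    dtype=str,
)

# Fill the missing cells with a single space, then write without the index.
spaced = df.fillna(" ")
out = io.StringIO()
spaced.to_csv(out, index=False, header=False)
csv_text = out.getvalue()
print(csv_text)
```

With the real file, path would replace the StringIO objects and the colspecs would have to match the actual column positions. If the file is not fixed-width, this approach does not apply and the missing fields need some other explicit marker in the input.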