How to delete a row in a Pandas DataFrame and relabel the index?


I am reading a file into a Pandas DataFrame that may contain invalid (i.e. NaN) rows. The data is sequential, so row_id+1 refers back to row_id. When I use frame.dropna(), I get the desired structure, but the index labels stay as they were originally assigned. How can I reassign the index labels to 0 through N-1, where N is the number of rows after dropna()?


Solution

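  • Use pandas.DataFrame.reset_index() with the option drop=True. This relabels the rows 0 to N-1 and discards the old index instead of inserting it as a new column. reset_index() returns a new DataFrame, so assign the result if you want to keep it.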

    In [14]: df = pd.DataFrame(np.random.randn(5,4))
    
    In [15]: df.iloc[::3] = np.nan  # set every third row (0 and 3) to NaN
    
    In [16]: df
    Out[16]:
              0         1         2         3
    0       NaN       NaN       NaN       NaN
    1  1.895803  0.532464  1.879883 -1.802606
    2  0.078928  0.053323  0.672579 -1.188414
    3       NaN       NaN       NaN       NaN
    4 -0.766554 -0.419646 -0.606505 -0.162188
    
    In [17]: df = df.dropna()
    
    In [18]: df.reset_index(drop=True)
    Out[18]:
              0         1         2         3
    0  1.895803  0.532464  1.879883 -1.802606
    1  0.078928  0.053323  0.672579 -1.188414
    2 -0.766554 -0.419646 -0.606505 -0.162188
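
For reference, the same steps can be sketched as a self-contained script. This is only a minimal illustration: the column names "a" and "b" and the sample values are made up for the example and do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: rows 0 and 3 are entirely NaN.
df = pd.DataFrame(
    {
        "a": [np.nan, 1.0, 2.0, np.nan, 4.0],
        "b": [np.nan, 10.0, 20.0, np.nan, 40.0],
    }
)

# dropna() removes the NaN rows but keeps the original labels (1, 2, 4).
cleaned = df.dropna()

# reset_index(drop=True) relabels the rows 0..N-1 and discards the old labels.
relabeled = cleaned.reset_index(drop=True)

# Without drop=True, the old labels are kept in a new column named "index".
with_old_labels = cleaned.reset_index()

print(relabeled)
```

Keeping the old labels (the with_old_labels variant) can be useful if you later need to map the cleaned rows back to their positions in the original file.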