Search code examples
pythonpandasdata-scienceseries

How to delete rows in Pandas after a certain value?


I replicated a Pandas series with the following code:

data = np.array([1, 2, 3, 4, 5, np.nan, np.nan, np.nan, 9,10,11,12,13,14])
  
ser = pd.Series(data)
print(ser)

I would like to select only the columns before the NaN values so that I only get the values 1,2,3,4,5. How should I do that?


Solution

  • Test missing values with Series.isna and add Series.cummax for repeat Trues after first match and last invert mask by ~, filter in boolean indexing:

    a = ser[~ser.isna().cummax()]
    print(a)
    0    1.0
    1    2.0
    2    3.0
    3    4.0
    4    5.0
    dtype: float64
    

    Alternative solution with cumulative sum:

    a = ser[ser.isna().cumsum().eq(0)]
    print(a)
    0    1.0
    1    2.0
    2    3.0
    3    4.0
    4    5.0
    dtype: float64
    

    Details:

    print(ser.to_frame().assign(testna = ser.isna(), 
                                cummax = ser.isna().cumsum(),
                                invert = ser.isna().cumsum().eq(0)))
           0  testna  cummax  invert
    0    1.0   False       0    True
    1    2.0   False       0    True
    2    3.0   False       0    True
    3    4.0   False       0    True
    4    5.0   False       0    True
    5    NaN    True       1   False
    6    NaN    True       2   False
    7    NaN    True       3   False
    8    9.0   False       3   False
    9   10.0   False       3   False
    10  11.0   False       3   False
    11  12.0   False       3   False
    12  13.0   False       3   False
    13  14.0   False       3   False
    

    print(ser.to_frame().assign(testna = ser.isna(), 
                                cummax = ser.isna().cummax(),
                                test0 = ~ser.isna().cummax()))
    
           0  testna  cummax   test0
    0    1.0   False   False    True
    1    2.0   False   False    True
    2    3.0   False   False    True
    3    4.0   False   False    True
    4    5.0   False   False    True
    5    NaN    True    True   False
    6    NaN    True    True   False
    7    NaN    True    True   False
    8    9.0   False    True   False
    9   10.0   False    True   False
    10  11.0   False    True   False
    11  12.0   False    True   False
    12  13.0   False    True   False
    13  14.0   False    True   False