I replicated a Pandas series with the following code:
data = np.array([1, 2, 3, 4, 5, np.nan, np.nan, np.nan, 9,10,11,12,13,14])
ser = pd.Series(data)
print(ser)
I would like to select only the columns before the NaN values so that I only get the values 1,2,3,4,5. How should I do that?
Test missing values with Series.isna
and add Series.cummax
for repeat True
s after first match and last invert mask by ~
, filter in boolean indexing
:
a = ser[~ser.isna().cummax()]
print(a)
0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
dtype: float64
Alternative solution with cumulative sum:
a = ser[ser.isna().cumsum().eq(0)]
print(a)
0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
dtype: float64
Details:
print(ser.to_frame().assign(testna = ser.isna(),
cummax = ser.isna().cumsum(),
invert = ser.isna().cumsum().eq(0)))
0 testna cummax invert
0 1.0 False 0 True
1 2.0 False 0 True
2 3.0 False 0 True
3 4.0 False 0 True
4 5.0 False 0 True
5 NaN True 1 False
6 NaN True 2 False
7 NaN True 3 False
8 9.0 False 3 False
9 10.0 False 3 False
10 11.0 False 3 False
11 12.0 False 3 False
12 13.0 False 3 False
13 14.0 False 3 False
print(ser.to_frame().assign(testna = ser.isna(),
cummax = ser.isna().cummax(),
test0 = ~ser.isna().cummax()))
0 testna cummax test0
0 1.0 False False True
1 2.0 False False True
2 3.0 False False True
3 4.0 False False True
4 5.0 False False True
5 NaN True True False
6 NaN True True False
7 NaN True True False
8 9.0 False True False
9 10.0 False True False
10 11.0 False True False
11 12.0 False True False
12 13.0 False True False
13 14.0 False True False