Suppose I have the following DataFrame
index Value
0 'A'
1 'B'
2 pd.NA
3 pd.NA
4 'C'
5 pd.NA
Then, I'm looking for a function that returns the next two DataFrames:
First:
index Value
0 'A'
1 'B'
Second:
index Value
4 'C'
You can build a mask of NAs (with isna
), form groups of successive non-NAs with cumsum
and split with groupby
:
m = df['Value'].isna()
dfs = [g for _,g in df[~m].groupby(m.cumsum())]
Output:
[ index Value
0 0 A
1 1 B,
index Value
4 4 C]
Intermediates:
index Value isna ~ cumsum
0 0 A False True 0
1 1 B False True 0
2 2 NaN True False 1
3 3 NaN True False 2
4 4 C False True 2
5 5 NaN True False 3