Apparently this is a more complicated problem than I thought. All I want to do is replace each run of zeros with a counter that increments by 1 (2, 3, 4, ...) and resets at the next 1.
My dataset has 1M+ rows, so I'd like to vectorize this operation if possible.
Here's a sample column:
import pandas as pd

# Define the input dataframe
df = pd.DataFrame({'col': [1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0]})
0 1
1 0
2 1
3 0
4 1
5 1
6 0
7 0
8 0
9 0
10 1
11 0
12 1
13 1
14 0
Desired result:
0 1
1 2
2 1
3 2
4 1
5 1
6 2
7 3
8 4
9 5
10 1
11 2
12 1
13 1
14 2
I've tried a number of different approaches with ffill() and cumsum(), but the problem with cumsum() on its own is that the count never resets when the next 1 appears.
Group by the cumulative sum of column col and apply cumcount:
df['col'] = df.groupby(df['col'].cumsum())['col'].cumcount() + 1
col
0 1
1 2
2 1
3 2
4 1
5 1
6 2
7 3
8 4
9 5
10 1
11 2
12 1
13 1
14 2
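To see why this works, it may help to look at the intermediate grouping key. The sketch below rebuilds the sample frame and splits the one-liner into two steps; the names group_id, result, and steps are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'col': [1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0]})

# Every 1 bumps the running total, so each 1 starts a new group id
# and the zeros that follow it share that id.
group_id = df['col'].cumsum()

# cumcount numbers the rows within each group 0, 1, 2, ...; add 1 to start at 1.
result = df.groupby(group_id).cumcount() + 1

# Side-by-side view of the input, the grouping key, and the final counter.
steps = pd.DataFrame({'col': df['col'], 'group_id': group_id, 'result': result})
print(steps)
```

Note that this relies on the column containing only 0s and 1s. If the column starts with zeros, those rows fall into group 0 and are numbered from 1 as well.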