I have the following dataframe called df,
date flag1 flag2 flag3 flag4…
2020-12-31
2021-01-01
2021-01-02 1
2021-01-03
2021-01-04
2021-01-05 1
2021-01-06 1
2021-01-07
2021-01-08
2021-01-09
2021-01-10
2021-01-11 1 1
2021-01-12
I want to do a backfill when a 1 appears in any column, and fill backwards until a number appears or failing that, backfill til a set number.
So let’s say the set number to reduce o to is 0 and the decrement is 0.1, it should look like this,
date flag1 flag2 flag3 flag4…
2020-12-31 0.5 0.8 0.4
2021-01-01 0.0 0.6 0.9 0.5
2021-01-02 0.1 0.7 1.0 0.6
2021-01-03 0.2 0.8 0.7
2021-01-04 0.3 0.9 0.8
2021-01-05 0.4 1.0 0.9
2021-01-06 0.5 1.0
2021-01-07 0.6 0.6
2021-01-08 0.7 0.7
2021-01-09 0.8 0.8
2021-01-10 0.9 0.9
2021-01-11 1.0 1.0
2021-01-12
Can this be achieved with pandas? I want to be able to set the decrement amount and the limit for example the above would be 0.1 and 0.
I know that this command can increment the values backwards:
df1 = df1[::-1].fillna(method='ffill')
(df1 + (df1 == df1.shift()).cumsum()).sort_index()
But that’s not what I want.
You could also try using iloc
to change the values based on the indices where the column value is equals to 1.0:
import pandas as pd
import numpy as np
def process_data(c, n):
for idx in reversed(np.where(c==1)[0]):
c.iloc[np.arange(idx)[::-1][:n.shape[0]]] = n[idx-1::-1][::-1]
c.iat[idx] = 1.0
return c
df = df.apply(lambda r: process_data(r, np.linspace(1.0, 0.0, num=11)[1:]))
flag1 flag2 flag3 flag4
date
2020-12-31 NaN 0.5 0.8 0.4
2021-01-01 0.0 0.6 0.9 0.5
2021-01-02 0.1 0.7 1.0 0.6
2021-01-03 0.2 0.8 NaN 0.7
2021-01-04 0.3 0.9 NaN 0.8
2021-01-05 0.4 1.0 NaN 0.9
2021-01-06 0.5 NaN NaN 1.0
2021-01-07 0.6 NaN NaN 0.6
2021-01-08 0.7 NaN NaN 0.7
2021-01-09 0.8 NaN NaN 0.8
2021-01-10 0.9 NaN NaN 0.9
2021-01-11 1.0 NaN NaN 1.0
2021-01-12 NaN NaN NaN NaN