in sql request, i have a chance to sum with lagged param on the fly, like:
CASE
WHEN dt_rep.fcr = 0::numeric THEN 0::numeric
ELSE (dt_rep.fcr - lag(dt_rep.fcr) OVER (PARTITION BY dt_rep.place_id ORDER BY dt_rep.bucket)) / dt_rep.fcr
END AS cr
here, I order the table by bucket (it is timeframe) and sorted them by place_id (to let group by place_id for lagging). Then, i can simply filtered the table with values of cr is null what gives me required structure.
in pandas, I can shift and assign new column like:
data_fin['fcr_b'] = data_fin.groupby(['place_id'])['fcr'].shift(1)
then I need to combine columns 'fcr_b' and 'fcr' to new one 'cr' and filter dataframe where fcr_b is NaN.
Is there more accurate approache to get new column 'cr' something like:
data_fin[cr] = data_fin['fcr'].apply(lambda x: x + lag(x).groupby('place_id'))
<= of cause, it is not working example!
Something like this?
df = pd.DataFrame({"A": [1, 1, 1, 2, 2, 2, 3, 3, 3],
"B": ["a", "b", "c", "d", "e", "f", "g", "h", "i"],
"C": ['text', 'text', 'text', 'text', 'text', 'text', 'text', 'text', 'text']})
df['D'] = df.groupby('A', as_index=False)['B'].shift(1).add(df['C'], fill_value=0)
df