I have DataFrame with columns col1, col2, col3
. I want to create another col4
which contains col2[n+3]/col2-1
for each group in col1
separately:
|col1 | col2 | col3|
|-----|------|-----|
| A | 2 | 4 |
| A | 4 | 5 |
| A | 7 | 7 |
| A | 3 | 8 |
| A | 7 | 3 |
| B | 8 | 9 |
| B | 10 | 10 |
| B | 8 | 9 |
| B | 20 | 15 |
The output should be:
|col1 | col2 | col3| col4|
|-----|------|-----|-----|
| A | 2 | 4 | 0.5| # (3/2-1)
| A | 4 | 5 | 0.75| # (7/4-1)
| A | 7 | 7 | NA |
| A | 3 | 8 | NA |
| A | 7 | 3 | NA |
| B | 8 | 9 | 1.5 |
| B | 10 | 10 | NA |
| B | 8 | 9 | NA |
| B | 20 | 15 | NA |
My code is:
df['col4']= df.groupby('col1').apply(lambda x: x['col2'].shift(-3)/x['col2']-1)
which results in col4
with all entries 'NA'.
I also tried:
df['col4']= df.groupby('col1').pipe(lambda x: x['col2'].shift(-3)/x['col2']-1)
which ignores groups 'A' and 'B' and results in:
|col1 | col2 | col3| col4 |
|-----|------|-----|-------|
| A | 2 | 4 | 0.5 |
| A | 4 | 5 | 0.75 |
| A | 7 | 7 | 0.1428|
| A | 3 | 8 | 2.33 |
| A | 7 | 3 | 0.1428|
| B | 8 | 9 | 1.5 |
| B | 10 | 10 | NA |
| B | 8 | 9 | NA |
| B | 20 | 15 | NA |
Does anyone know how to do this task or fix my code?
IIUC:
df['col4'] = df.groupby('col1')['col2'].transform(lambda x: x.shift(-3)) / df['col2'] - 1
Output:
col1 col2 col3 col4
0 A 2 4 0.50
1 A 4 5 0.75
2 A 7 7 NaN
3 A 3 8 NaN
4 A 7 3 NaN
5 B 8 9 1.50
6 B 10 10 NaN
7 B 8 9 NaN
8 B 20 15 NaN
Use transform
to shift your 'col2' within each group then divide by 'col2' and subtract 1.