I have a table like this:
+------+------------+-------+
| Idx  | date       | value |
+------+------------+-------+
| A    | 20/11/2016 | 10    |
| A    | 21/11/2016 | 8     |
| A    | 22/11/2016 | 12    |
| A    | 23/11/2016 | 14    |
| B    | 20/11/2016 | 16    |
| B    | 21/11/2016 | 18    |
| B    | 22/11/2016 | 11    |
+------+------------+-------+
I'd like to add a new column 'rolling_quantile_value', computed from the column 'value': for each row it should contain a quantile of all the past values of 'value' within the same Idx.
For the example above, if the chosen quantile is the median, the output should look like this:
+------+------------+-------+-----------------------+
| Idx  | date       | value | rolling_median_value  |
+------+------------+-------+-----------------------+
| A    | 20/11/2016 | 10    | NaN                   |
| A    | 21/11/2016 | 8     | 10                    |
| A    | 22/11/2016 | 12    | 9                     |
| A    | 23/11/2016 | 14    | 10                    |
| B    | 20/11/2016 | 16    | NaN                   |
| B    | 21/11/2016 | 18    | 16                    |
| B    | 22/11/2016 | 11    | 17                    |
+------+------------+-------+-----------------------+
I've done it the naive way: a function that builds the new column row by row from the preceding values of 'value' and flags the jump from one Idx to the next, but I'm sure that's neither the most efficient nor the most elegant way to do it.
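Roughly, the kind of loop I mean (a simplified sketch, not my exact code):
import numpy as np
rolling = []
past = []                      # values already seen for the current Idx
prev_idx = None
for idx, value in zip(df['Idx'], df['value']):
    if idx != prev_idx:        # jump to a new Idx: reset the history
        past = []
        prev_idx = idx
    rolling.append(np.median(past) if past else np.nan)
    past.append(value)
df['rolling_median_value'] = rolling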
Looking forward to your suggestions!
I think you want expanding.
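To reproduce the sample data above (a minimal setup, assuming plain string dates):
import pandas as pd
df = pd.DataFrame({
    'Idx':   ['A', 'A', 'A', 'A', 'B', 'B', 'B'],
    'date':  ['20/11/2016', '21/11/2016', '22/11/2016', '23/11/2016',
              '20/11/2016', '21/11/2016', '22/11/2016'],
    'value': [10, 8, 12, 14, 16, 18, 11],
})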
df['rolling_median_value'] = (df.groupby('Idx', sort=False)   # keep groups in order of appearance
                                .expanding(1)['value']        # expanding window over the rows seen so far in each Idx
                                .median()
                                .groupby(level=0)             # level 0 of the resulting MultiIndex is Idx
                                .shift()                      # shift within each Idx so only strictly past rows count
                                .reset_index(drop=True))      # back to the original positional index
print(df)
  Idx        date  value  rolling_median_value
0   A  20/11/2016     10                   NaN
1   A  21/11/2016      8                  10.0
2   A  22/11/2016     12                   9.0
3   A  23/11/2016     14                  10.0
4   B  20/11/2016     16                   NaN
5   B  21/11/2016     18                  16.0
6   B  22/11/2016     11                  17.0
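An equivalent way to write this (a sketch, not part of the original answer) is to shift within each group first and then take the expanding statistic; transform keeps the original index, so the second groupby and the reset_index are not needed:
df['rolling_median_value'] = (df.groupby('Idx', sort=False)['value']
                                .transform(lambda s: s.shift().expanding().median()))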
UPDATE
df['rolling_quantile_value'] = (df.groupby('Idx', sort=False)
                                  .expanding(1)['value']
                                  .quantile(0.75)            # same pattern, any quantile works, e.g. the 75th percentile
                                  .groupby(level=0)
                                  .shift()
                                  .reset_index(drop=True))
print(df)
  Idx        date  value  rolling_quantile_value
0   A  20/11/2016     10                     NaN
1   A  21/11/2016      8                    10.0
2   A  22/11/2016     12                     9.5
3   A  23/11/2016     14                    11.0
4   B  20/11/2016     16                     NaN
5   B  21/11/2016     18                    16.0
6   B  22/11/2016     11                    17.5
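If the quantile needs to vary, the same pattern can be wrapped in a small helper (the name past_quantile is just for illustration):
def past_quantile(df, q, col='value', group='Idx'):
    # Expanding quantile of `col` over strictly earlier rows within each `group`.
    return (df.groupby(group, sort=False)
              .expanding(1)[col]
              .quantile(q)
              .groupby(level=0)
              .shift()
              .reset_index(drop=True))

df['rolling_quantile_value'] = past_quantile(df, 0.75)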