I have an R - dataframe with the following format:
id | date | value |
---|---|---|
1618 | 01/01/23 | 10 |
1618 | 02/01/23 | 10 |
1618 | 03/01/23 | 6 |
1618 | 04/01/23 | 8 |
1618 | 05/01/23 | 10 |
1618 | 06/01/23 | 4 |
3512 | 01/01/23 | 1 |
3512 | 02/01/23 | 1 |
3512 | 03/01/23 | 3 |
3512 | 04/01/23 | 4 |
3512 | 05/01/23 | 0 |
3512 | 06/01/23 | 8 |
I want to do a groupby on the id and add a column with the lowest value so far and a column showing the delta in lowest value. For the example above the expected output looks like:
id | date | value | lowest | delta lowest |
---|---|---|---|---|
1618 | 01/01/23 | 10 | 10 | 0 |
1618 | 02/01/23 | 10 | 10 | 0 |
1618 | 03/01/23 | 6 | 6 | -4 |
1618 | 04/01/23 | 8 | 6 | 0 |
1618 | 05/01/23 | 10 | 6 | 0 |
1618 | 06/01/23 | 4 | 4 | -2 |
3512 | 01/01/23 | 1 | 1 | 0 |
3512 | 02/01/23 | 2 | 1 | 0 |
3512 | 03/01/23 | 3 | 1 | 0 |
3512 | 04/01/23 | 4 | 1 | 0 |
3512 | 05/01/23 | 0 | 0 | -1 |
3512 | 06/01/23 | 8 | 0 | 0 |
How can I best approach this in R?
An approach using cummin
and diff
library(dplyr)
df %>%
mutate(lowest = cummin(value), delta_lowest = c(0, diff(lowest)), .by = id)
id date value lowest delta_lowest
1 1618 01/01/23 10 10 0
2 1618 02/01/23 10 10 0
3 1618 03/01/23 6 6 -4
4 1618 04/01/23 8 6 0
5 1618 05/01/23 10 6 0
6 1618 06/01/23 4 4 -2
7 3512 01/01/23 1 1 0
8 3512 02/01/23 1 1 0
9 3512 03/01/23 3 1 0
10 3512 04/01/23 4 1 0
11 3512 05/01/23 0 0 -1
12 3512 06/01/23 8 0 0
df <- structure(list(id = c(1618L, 1618L, 1618L, 1618L, 1618L, 1618L,
3512L, 3512L, 3512L, 3512L, 3512L, 3512L), date = c("01/01/23",
"02/01/23", "03/01/23", "04/01/23", "05/01/23", "06/01/23", "01/01/23",
"02/01/23", "03/01/23", "04/01/23", "05/01/23", "06/01/23"),
value = c(10L, 10L, 6L, 8L, 10L, 4L, 1L, 1L, 3L, 4L, 0L,
8L)), class = "data.frame", row.names = c(NA, -12L))