I have a table similar to this minimal example without the difference column:
trigger | values | difference |
---|---|---|
0 | 3 | |
0 | NA | |
0 | NA | |
1 | 5 | 2 |
0 | 4 | |
0 | NA | |
1 | 10 | 6 |
I want to subtract the above number (and leave out the NAs) from the number at each trigger point (trigger = 1)
Is there a way to do this in R?
Edit:
I have now the situation where the triggers lie close together like in this example:
trigger | values | difference |
---|---|---|
0 | 3 | |
0 | NA | |
0 | NA | |
1 | 5 | 2 |
0 | 4 | |
1 | 5 | 1 |
0 | 10 |
How can I tackle this problem?
Create a grouping column with cumsum
on the 'trigger' and taking the lag
, then do the difference between the first and last element and replace
it as the last
value per group
library(dplyr)
df1 %>%
group_by(grp = lag(cumsum(trigger), default = 0)) %>%
mutate(difference = replace(rep(NA, n()), n(),
values[n()] - values[1])) %>%
ungroup %>%
select(-grp)
-output
# A tibble: 7 × 3
trigger values difference
<int> <int> <int>
1 0 3 NA
2 0 NA NA
3 0 NA NA
4 1 5 2
5 0 4 NA
6 0 NA NA
7 1 10 6
For the second case, we may need a condition with if/else
that checks the number of rows i.e. if
the number of rows is greater than 1 only need the computation to replace
df2 %>%
group_by(grp = lag(cumsum(trigger), default = 0)) %>%
mutate(difference = if(n() > 1) replace(rep(NA, n()), n(),
values[n()] - values[1]) else NA) %>%
ungroup
-output
# A tibble: 7 × 4
trigger values grp difference
<int> <int> <dbl> <int>
1 0 3 0 NA
2 0 NA 0 NA
3 0 NA 0 NA
4 1 5 0 2
5 0 4 1 NA
6 1 5 1 1
7 0 10 2 NA
df1 <- structure(list(trigger = c(0L, 0L, 0L, 1L, 0L, 0L, 1L), values = c(3L,
NA, NA, 5L, 4L, NA, 10L)), class = "data.frame", row.names = c(NA,
-7L))
df2 <- structure(list(trigger = c(0L, 0L, 0L, 1L, 0L, 1L, 0L), values = c(3L,
NA, NA, 5L, 4L, 5L, 10L)), class = "data.frame", row.names = c(NA,
-7L))