I have a `data.frame` representing different time series. In one column, I marked interesting time points (note: there can be multiple interesting time points per `Id`):
Id | Time | Value | Interesting |
---|---|---|---|
1 | 0 | 12 | 0 |
1 | 1 | 14 | 0 |
1 | 2 | 11 | 0 |
1 | 3 | 12 | 1 |
1 | 4 | 13 | 0 |
1 | 5 | 14 | 0 |
1 | 6 | 12 | 0 |
1 | 7 | 12 | 0 |
.. | .. | .. | .. |
78 | 128 | 13 | 0 |
Now I would also like to mark `n` time points before and `m` time points after each interesting point as an interesting block. So with `n = 2` and `m = 3`, I would expect this:
Id | Time | Value | Interesting | Block |
---|---|---|---|---|
1 | 0 | 12 | 0 | 0 |
1 | 1 | 14 | 0 | 1 |
1 | 2 | 11 | 0 | 1 |
1 | 3 | 12 | 1 | 1 |
1 | 4 | 13 | 0 | 1 |
1 | 5 | 14 | 0 | 1 |
1 | 6 | 12 | 0 | 1 |
1 | 7 | 12 | 0 | 0 |
.. | .. | .. | .. | .. |
78 | 128 | 13 | 0 | 0 |
At the moment, I use `gaussianSmooth()` and a threshold:
df %>% mutate(Block = ifelse(gaussianSmooth(Interesting, sigma = 4) > 0.001, 1, 0))
But this is cumbersome and only works if `n = m`. Is there a simpler solution where I can easily set how many rows before and after should be changed? Solutions preferably in `dplyr`/`tidyverse`.
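With `group_modify` (this also works when a group has multiple `Interesting` values). First get the indices you want, here the positions where `Interesting == 1`, then loop over them and set the surrounding values, `max(0, i - n):min(nrow(.x), i + m)`, to 1. Note that this overwrites `Interesting` in place rather than adding a separate `Block` column.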
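library(dplyr)

n = 2  # rows to mark before each interesting point
m = 3  # rows to mark after each interesting point

df %>%
  group_by(Id) %>%
  group_modify(~ {
    # positions of the interesting points within this group
    idx <- which(.x$Interesting == 1)
    for (i in idx) {
      # clamp the window to the group; R ignores a zero index on assignment
      .x$Interesting[max(0, i - n):min(nrow(.x), i + m)] <- 1
    }
    .x
  })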
# A tibble: 8 × 4
# Groups:   Id [1]
     Id  Time Value Interesting
  <int> <int> <int>       <dbl>
1     1     0    12           0
2     1     1    14           1
3     1     2    11           1
4     1     3    12           1
5     1     4    13           1
6     1     5    14           1
7     1     6    12           1
8     1     7    12           0
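
If you would rather keep `Interesting` untouched and get the `Block` column from the question, something along these lines might work. It is only a sketch: `mark_block` is a hypothetical helper name, the sample `df` just rebuilds the eight rows shown for `Id` 1, and it assumes the rows are already sorted by `Time` within each `Id` and that "n before / m after" means rows, not time units.

```r
library(dplyr)

# Sample data rebuilt from the rows shown in the question (Id 1 only)
df <- data.frame(
  Id = 1L,
  Time = 0:7,
  Value = c(12L, 14L, 11L, 12L, 13L, 14L, 12L, 12L),
  Interesting = c(0, 0, 0, 1, 0, 0, 0, 0)
)

n <- 2  # rows to mark before each interesting point
m <- 3  # rows to mark after each interesting point

# Hypothetical helper: returns 1 for every position that lies within
# n rows before or m rows after any flagged position, 0 otherwise.
mark_block <- function(flag, n, m) {
  idx <- which(flag == 1)
  # collect the clamped window around each flagged position
  pos <- unlist(lapply(idx, function(i) {
    seq(max(1, i - n), min(length(flag), i + m))
  }))
  as.integer(seq_along(flag) %in% pos)
}

res <- df %>%
  group_by(Id) %>%
  mutate(Block = mark_block(Interesting, n, m)) %>%
  ungroup()

print(res)
```

Because the helper works on a plain vector, it runs inside a grouped `mutate()`, so windows never spill over from one `Id` into the next, and `n` and `m` can differ.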