R / data cleaning: Separating multiple time series in a data set


I have a data frame containing multiple time series chunks. The chunks don't have identifiers, but the first entry of a chunk is indicated with a boolean variable. How can I use this variable to create identifiers?

Example data:

set.seed(102)
chunks <- data.frame(entry = 1:50,
                     date = seq(ISOdate(2015,1,1), by = "day", length.out = 50),
                     newchunk = c(1, rbinom(49, 1, .2)),
                     measurement = rnorm(50, 100, 10))

The result should be a new variable "seqID" which groups the chunks. I wondered if the tidyr package can handle this situation.


Solution

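  • You can use cumsum. Because newchunk is 1 on the first row of each chunk and 0 elsewhere, its cumulative sum increases by one at every chunk start, so all rows of a chunk share the same value. This is base R, so tidyr is not needed.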

    chunks$seqID <- cumsum(chunks$newchunk)
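
To see why this works, here is a minimal sketch on a small hand-made data frame. The name toy and its values are made up for illustration and are not part of the question's data. It assumes, as in the example data, that the first row is flagged with newchunk = 1 so that the identifiers start at 1.

```r
# Toy data (hypothetical values): newchunk == 1 marks the first row of a chunk
toy <- data.frame(
  entry       = 1:8,
  newchunk    = c(1, 0, 0, 1, 0, 1, 1, 0),
  measurement = c(10, 12, 11, 20, 22, 30, 40, 42)
)

# The running total goes up by one at every chunk start,
# so every row of a chunk gets the same identifier
toy$seqID <- cumsum(toy$newchunk)

# The identifier can then be used for per-chunk summaries
chunk_sizes <- table(toy$seqID)                           # rows per chunk
chunk_means <- tapply(toy$measurement, toy$seqID, mean)   # mean per chunk

print(toy)
print(chunk_sizes)
print(chunk_means)
```

Here seqID comes out as 1 1 1 2 2 3 4 4, which gives four chunks of sizes 3, 2, 1 and 2. The same identifier should also work as a grouping variable in other tools, for example with split() or aggregate().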