Tags: r, count, multiple-columns

Count consecutive occurrences of a value in rows of R data frame


I would like to count the consecutive occurrences of a value in each row of an R data frame, starting at the first column, whenever the first column takes that value, and save the result in a new variable. If this were my data and I were interested in the value 1:

df <- data.frame(
  A = c(1, 0, 1, 0, 1),
  B = c(0, 0, 1, 1, 1),
  C = c(1, 1, 0, 1, 1),
  D = c(0, 1, 0, 0, 1),
  E = c(1, 0, 1, 0, 0)
)

I would want to create the following output:

df <- data.frame(
  A = c(1, 0, 1, 0, 1),
  B = c(0, 0, 1, 1, 1),
  C = c(1, 1, 0, 1, 1),
  D = c(0, 1, 0, 0, 1),
  E = c(1, 0, 1, 0, 0),
  count = c(1, 0, 2, 0, 4)
)

I tried something like this but am not really certain if this is sensible:

df$count <- apply(df[, sapply(df, is.numeric)], 1, function(x) {
  r <- rle(x == 1)
  max(r$lengths[r$values])
})

And it also does not yet take into account that I am interested in spells starting at the first column. Any help is much appreciated!


Solution

  • Using rle

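    # If the row starts with 1, take the length of its first run; otherwise 0.
    # lengths[1] is used rather than max(lengths) so that a longer run later
    # in the row cannot be picked up.
    cbind(df, count = apply(df, 1, function(x)
      ifelse(x[1] == 1, rle(x)$lengths[1], 0)))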
      A B C D E count
    1 1 0 1 0 1     1
    2 0 0 1 1 0     0
    3 1 1 0 0 1     2
    4 0 1 1 0 0     0
    5 1 1 1 1 0     4
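
As an alternative, the same count can be sketched without rle by using cumprod, which drops to 0 at the first non-matching cell and stays there, so its sum is the length of the run that starts in the first column. The helper name `leading_run` and its `value` argument are hypothetical (not part of the original answer), and the sketch assumes the data contains no NA values.

```r
# Example data from the question
df <- data.frame(
  A = c(1, 0, 1, 0, 1),
  B = c(0, 0, 1, 1, 1),
  C = c(1, 1, 0, 1, 1),
  D = c(0, 1, 0, 0, 1),
  E = c(1, 0, 1, 0, 0)
)

# Hypothetical helper: for each row, the length of the run of `value`
# that starts in the first column (0 if the row does not start with `value`).
# Assumes there are no NA values in `data`.
leading_run <- function(data, value = 1) {
  hits <- as.matrix(data) == value   # logical matrix, same shape as data
  vapply(
    seq_len(nrow(hits)),
    # cumprod() becomes 0 at the first FALSE and stays 0 afterwards,
    # so the sum counts only the leading TRUEs of the row
    function(i) sum(cumprod(hits[i, ])),
    numeric(1)
  )
}

df$count <- leading_run(df)
df$count
# [1] 1 0 2 0 4

# A row where a later run is longer than the leading one still gives 1
leading_run(data.frame(A = 1, B = 0, C = 1, D = 1, E = 1))
# [1] 1
```

Because only the leading run contributes to the sum, rows that begin with a different value get 0 without a separate check on the first column.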