Search code examples
rdplyrtidyversedata-manipulationcounting

Counting consecutive columns that satisfy a condition in R


I'm working in RStudio with my dataset that contains the daily milk production of cows reared in different locations. Also, I have 5 columns with the temperature in the corresponding 5 previous days to the milking control day.

I would like to count the number of consecutive days ion which the temperature overcome a certain threshold (ex. 30ºC) independently of the position of these days (3 consecutive days could occur during the 3,4,5 days previous to milking control for example). Also, in case that during this 5 days an event with for example, 1 days, and another event with 3 consecutive days occur, I need to take into account the higher number.

Here is a toy dataset that schematically reflects my dataset. How can I count in R the consecutive days that satisfy my conditions?

data <- data.frame(cow=1:5, milk=c(35,36,36,35,34), 
       day1ago=c(27,28,20,24,33), 
       day2ago=c(25,25,32,31,28),
       day3ago=c(22,31,25,31,29),
       day4ago=c(28,33,32,33,28),
       day5ago=c(29,28,33,34,31))

To these toy dataset, I would expect to obtain a vector like this:

data$consecutive_days = c(0,2,2,4,1)

Solution

  • You could use rle on a row-wise data frame:

    library(dplyr)
    
    data |>
      rowwise() |>
      mutate(rle = list(rle(c_across(starts_with("day")) > 30)),
             consecutive_days = with(rle, ifelse(any(values), max(lengths[values]), 0))) |>
      ungroup() |>
      select(-rle)
    

    rle computes runs of equal values. In this case runs of TRUE/FALSE if the value is above or below the threshold. We use this to find the longest (max) length (lengths) of TRUE values.

    Output

        cow  milk day1ago day2ago day3ago day4ago day5ago consecutive_days
      <int> <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>            <dbl>
    1     1    35      27      25      22      28      29                0
    2     2    36      28      25      31      33      28                2
    3     3    36      20      32      25      32      33                2
    4     4    35      24      31      31      33      34                4
    5     5    34      33      28      29      28      31                1