r  dataframe  sum  intervals

Calculate the length of an interval if data are equal to zero


I have a dataframe with time points and the corresponding measure of activity in different subjects. Each time point represents a 5-minute interval.

time        Subject1    Subject2
06:03:00    6,682129    8,127075
06:08:00    3,612061    20,58838
06:13:00    0           0
06:18:00    0,9030762   0
06:23:00    0           0
06:28:00    0           0
06:33:00    0           0
06:38:00    0           7,404663
06:43:00    0           11,55835
...

I would like to calculate the length of each interval of consecutive time points with zero activity, as in the example below:

             Subject1    Subject2
Interval_1   1           5
Interval_2   5

I have the impression that I should solve this using loops and conditions, but as I am not very experienced with loops, I do not know where to start. Do you have any idea how to solve this? Any help is really appreciated!


Solution

  • You can use rle() to find runs of consecutive equal values and the length of each run. We then keep only the runs where the value is 0 (note that the \(x) shorthand for function(x) requires R 4.1 or later):

    result = lapply(df[-1], \(x) with(rle(x), lengths[values == 0]))
    result
    # $Subject1
    # [1] 1 5
    # 
    # $Subject2
    # [1] 5
    

    Because different subjects can have different numbers of 0-runs, the results fit more naturally in a list than in a rectangular data frame.
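
    If a rectangular layout like the one in the question is still wanted, the list can be padded with NA. The following is a minimal sketch, not part of the original answer. It assumes the sample data has been read in as numeric columns (decimal commas converted to points); the names zero_runs, n_max and padded are made up for illustration.

```r
# Rebuild the sample data from the question, with decimal commas
# converted to decimal points (an assumption about how it was read in).
df <- data.frame(
  time = c("06:03:00", "06:08:00", "06:13:00", "06:18:00", "06:23:00",
           "06:28:00", "06:33:00", "06:38:00", "06:43:00"),
  Subject1 = c(6.682129, 3.612061, 0, 0.9030762, 0, 0, 0, 0, 0),
  Subject2 = c(8.127075, 20.58838, 0, 0, 0, 0, 0, 7.404663, 11.55835)
)

# rle() on a logical vector gives one entry per run of TRUE or FALSE.
runs <- rle(df$Subject1 == 0)
runs$values   # FALSE  TRUE FALSE  TRUE
runs$lengths  # 2 1 1 5

# Lengths of the zero-runs for every subject column (everything but time).
zero_runs <- lapply(df[-1], function(x) with(rle(x == 0), lengths[values]))

# Pad each vector to the longest one; indexing past the end yields NA.
n_max <- max(lengths(zero_runs))
padded <- as.data.frame(lapply(zero_runs, function(x) x[seq_len(n_max)]))
rownames(padded) <- paste0("Interval_", seq_len(n_max))

padded
#            Subject1 Subject2
# Interval_1        1        5
# Interval_2        5       NA
```

    Each count is a number of 5-minute time points, so multiplying padded by 5 would give the durations in minutes.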