Search code examples
rfor-loopdata-cleaning

R: count pairs of strings in one column and reset count based on different column


I am trying to add a column to a df that holds a count of a pair of strings that are repeated over multiple rows. The count needs to reset based on a change in another column.

More specifically: I am trying to add trial numbers to a very large data frame. Each trial consists of 2 parts (show followed by point), show and point are each associated with a value, and there can be an arbitrary number of show/point values per trial. Each ID can have a different number of trials, but each trial will always have a show, followed by a point. This means each ID will have a different number of rows.

Sample data:

ID <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
TrialType <- c("Show", "Show", "Show", "Point", "Point", "Point", "Point", "Show", "Show", "Show", "Show", "Point", "Show", "Show", "Point", "Show", "Show", "Show", "Point", "Point", "Point", "Show", "Show", "Show", "Show", "Point", "Show", "Show", "Show", "Point", "Point", "Point")
Value <- c(0.52, 0.54, 0.55, 0.57, 0.58, 0.59,0.75,0.89,0.32,0.99,0.01,0.02,0.56,0.67,0.32,0.59,0.75,0.89,0.32,0.99,0.01,0.02,0.56,0.67,0.32,0.55, 0.57, 0.58, 0.59,0.75,0.89, 0.99)
df<-as.data.frame(c(ID, TrialType, Value))
TrialNumber<-c(1,1,1,1,1,1,1,2,2,2,2,2,3,3,3,1,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,3) 
df.desired <- cbind(ID, TrialType, Value, TrialNumber)

I think I need have a loop that goes through ID, but this is too advanced for me to figure out. I am new to R and to stackoverflow. Thank you in advance for your help.


Solution

  • Using tidyverse:

    within each id, Check whether the current value is Point and the previous value is Show. If that is the case, start a new count.

     library(tidyverse)
     df %>%
         group_by(ID) %>%
         mutate(TrialNumber = TrialType == 'Show' &
                    lag(TrialType, default = 'Point') == 'Point',
                TrialNumber = cumsum(TrialNumber))
    
       ID TrialType Value TrialNumber
    1   1      Show  0.52           1
    2   1      Show  0.54           1
    3   1      Show  0.55           1
    4   1     Point  0.57           1
    5   1     Point  0.58           1
    6   1     Point  0.59           1
    7   1     Point  0.75           1
    8   1      Show  0.89           2
    9   1      Show  0.32           2
    10  1      Show  0.99           2
    11  1      Show  0.01           2
    12  1     Point  0.02           2
    13  1      Show  0.56           3
    14  1      Show  0.67           3
    15  1     Point  0.32           3
    16  2      Show  0.59           1
    17  2      Show  0.75           1
    18  2      Show  0.89           1
    19  2     Point  0.32           1
    20  2     Point  0.99           1
    21  2     Point  0.01           1
    22  2      Show  0.02           2
    23  2      Show  0.56           2
    24  2      Show  0.67           2
    25  2      Show  0.32           2
    26  2     Point  0.55           2
    27  2      Show  0.57           3
    28  2      Show  0.58           3
    29  2      Show  0.59           3
    30  2     Point  0.75           3
    31  2     Point  0.89           3
    32  2     Point  0.99           3