Tags: r, dplyr

Group R dataframe by chronological sequence for different events


I have a dataframe that can be simplified to this:

example <- data.frame(
  Subject = c(101, 101, 101, 101, 102, 102, 102, 102),
  event = c('A', 'B', 'B', 'A', 'A', 'B', 'B', 'A'),
  trialtype = c("immneg", "rneg", "immneg", "rneg", "rneg", "immneg", "rneg", "immneg"),
  timing = c(1,2,3,4,5,6,7,8))

I want to create a new column called ref1 that records, for each Subject and each event, whether immneg or rneg comes first in trialtype (ordered by the timing column). The value of ref1 should be 1 if rneg comes first and 0 if immneg comes first.

This is the desired solution:

solution <- data.frame(
  Subject = c(101, 101, 101, 101, 102, 102, 102, 102),
  event = c('A', 'B', 'B', 'A', 'A', 'B', 'B', 'A'),
  trialtype = c("immneg", "rneg", "immneg", "rneg", "rneg", "immneg", "rneg", "immneg"),
  timing = c(1,2,3,4,5,6,7,8),
  ref1 = c(0,1,1,0,1,0,0,1))

Solution

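  • This reproduces your desired data frame: group by Subject and event, then use mutate to set ref1 from the trialtype of the first row in each group. Note that first() takes the first row as it appears in the data, so this assumes the rows are already sorted by timing, as they are in your example.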

    library(dplyr)

    # Assumes rows are already sorted by timing within each Subject/event group
    example %>%
      group_by(Subject, event) %>%
      mutate(ref1 = ifelse(first(trialtype) == "rneg", 1, 0)) %>%
      ungroup()
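
If the rows are not guaranteed to be in timing order, the grouped approach above can be made independent of row order by picking the trial with the smallest timing in each group. The following is a minimal sketch under that assumption, not the only way to do it; the names `shuffled` and `result` are made up for illustration, and the rows are reordered on purpose to show that the result does not depend on how the data happens to be sorted.

```r
library(dplyr)

# Same data as in the question
example <- data.frame(
  Subject = c(101, 101, 101, 101, 102, 102, 102, 102),
  event = c("A", "B", "B", "A", "A", "B", "B", "A"),
  trialtype = c("immneg", "rneg", "immneg", "rneg", "rneg", "immneg", "rneg", "immneg"),
  timing = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# Hypothetical reordering: row order no longer matches timing order
shuffled <- example[c(4, 3, 8, 1, 7, 2, 6, 5), ]

result <- shuffled %>%
  group_by(Subject, event) %>%
  # which.min(timing) finds the earliest trial in the group,
  # regardless of where that row sits in the data
  mutate(ref1 = as.integer(trialtype[which.min(timing)] == "rneg")) %>%
  ungroup() %>%
  arrange(timing)  # restore chronological order for comparison

result$ref1
# [1] 0 1 1 0 1 0 0 1
```

After sorting by timing, ref1 matches the desired column from the question. Using as.integer() on the comparison gives 1/0 directly and is equivalent here to the ifelse() call, since trialtype has no missing values in this example.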