Trying to find occurrences of an ID that meets sequential conditions in R


I'm trying to return a logical vector based on whether a person meets one set of conditions and ALSO meets another set of conditions later on. I'm using a data frame that looks like so:

Person.Id   year   term
250         1      3
250         1      1
250         2      3
300         1      3
511         2      1
300         1      5
700         2      3

What I want to return is a logical vector that indicates TRUE/FALSE for whether a person (for example, ID 250) has year 1 and term 3 AND later has year 2 and term 3. So a person who only has year 1 term 3 or year 1 term 5 should return FALSE. Solutions in dplyr preferred! I feel like this is simple and I'm just missing something. I initially tried this code, but all it returned was an empty data frame:

df2 <- df1 %>%
        group_by(Person.Id) %>%
        filter((year==1 & term==3) & (year==2 & term==3)) 

Solution

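  • Are you looking for something like this?

    library(dplyr)
    
    df %>% 
      group_by(Person.Id) %>% 
      # count the rows in each person's group that match either condition
      mutate(count = sum((year == 1 & term == 3) | (year == 2 & term == 3))) %>% 
      # TRUE when both rows are present (assumes no duplicate rows per person)
      mutate(count2 = count == 2)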
    
    # A tibble: 7 x 5
    # Groups:   Person.Id [4]
      Person.Id  year  term count count2
          <int> <int> <int> <int> <lgl> 
    1       250     1     3     2 TRUE  
    2       250     1     1     2 TRUE  
    3       250     2     3     2 TRUE  
    4       300     1     3     1 FALSE 
    5       511     2     1     0 FALSE 
    6       300     1     5     1 FALSE 
    7       700     2     3     1 FALSE
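
A possible variation, in case a person can have repeated rows: counting matches and comparing to 2 would also flag someone with two year 1, term 3 rows and no year 2, term 3 row. The sketch below checks each condition separately with any(). The column name meets_both and the object name flags are made up for illustration, the data frame is rebuilt from the question's sample, and it assumes dplyr 1.0.0 or later for the .groups argument. It also treats year 2 as "later" than year 1, regardless of row order.

```r
library(dplyr)

# Sample data from the question (assumed to use lower-case year/term columns).
df <- tibble(
  Person.Id = c(250, 250, 250, 300, 511, 300, 700),
  year      = c(1, 1, 2, 1, 2, 1, 2),
  term      = c(3, 1, 3, 3, 1, 5, 3)
)

# The filter() in the question tested both conditions on the same row, and no
# single row can have year == 1 and year == 2 at once, so nothing survived.
# any() asks instead whether each condition holds somewhere within the group.
flags <- df %>%
  group_by(Person.Id) %>%
  summarise(
    meets_both = any(year == 1 & term == 3) & any(year == 2 & term == 3),
    .groups = "drop"
  )

flags
```

With the sample data, only person 250 comes back TRUE. Swapping summarise() for mutate() (and dropping the .groups argument) should give one logical value per original row instead of one per person.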