Tags: r, database, analysis

Analysis of many attendance lists


I have 8 attendance lists from 8 different conferences. I need to know which people attended at least 7 of the 8 conferences. I don't want to check name by name in each list, so I'm planning to do it in R, but I have no idea how. Any suggestions?


Solution

  • There might be a simpler way (my R is getting a bit rusty), but this works:

    library(dplyr)
    unique_attendees <- c('a', 'b', 'c', 'd', 'e')
    
    conf1_attendees <- c('a','b')
    conf2_attendees <- c('a','b','c')
    conf3_attendees <- c('a','b','c','e')
    conf4_attendees <- c('b', 'e')
    conf5_attendees <- c('a','d', 'e')
    conf6_attendees <- c('a','d', 'e')
    conf7_attendees <- c('a','b', 'e')
    conf8_attendees <- c('a','b', 'c')
    
    conferences <- list(conf1_attendees, conf2_attendees, conf3_attendees, conf4_attendees, conf5_attendees, conf6_attendees, conf7_attendees,conf8_attendees)
    
    # Count, for each person, how many conference lists contain their name.
    attendance_record <- dplyr::bind_rows(lapply(unique_attendees, function(x) {
      # Exact matching with %in%; grepl() would also match substrings,
      # e.g. 'Ann' inside 'Anna', and treats the name as a regex.
      attended <- vapply(conferences, function(y) x %in% y, logical(1))
      data.frame(person = x, number_attended = sum(attended))
    }))
    
    # Flag everyone who attended at least 7 of the 8 conferences.
    result <- attendance_record %>% 
      mutate(attended_at_least_7 = number_attended >= 7)
    
    print(result)
    

    Output:

      person number_attended attended_at_least_7
    1      a               7                TRUE
    2      b               6               FALSE
    3      c               3               FALSE
    4      d               2               FALSE
    5      e               5               FALSE
    

    Obviously you'll need to adapt it to your problem since we don't know how your records are stored.
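
As one possible adaptation, here is a more compact sketch in base R that counts attendance with `table()` instead of looping over a hand-written list of names. The function name `count_attendance`, the `min_attended` parameter, and the `meets_threshold` column are illustrative names, not part of the answer above. The name clean-up step assumes that the same person differs between lists at most by letter case and surrounding whitespace, which may not hold for real records.

```r
# Sketch: count how many conference lists each person appears in.
# `count_attendance`, `min_attended` and `meets_threshold` are hypothetical names.
count_attendance <- function(conferences, min_attended = 7) {
  # Assumption: names differ at most by case and surrounding whitespace.
  # unique() ensures a person is counted once per conference.
  cleaned <- lapply(conferences, function(x) unique(tolower(trimws(x))))

  # One entry per (person, conference) pair; table() counts entries per person.
  counts <- table(unlist(cleaned))

  result <- data.frame(
    person = names(counts),
    number_attended = as.integer(counts),
    stringsAsFactors = FALSE
  )
  result$meets_threshold <- result$number_attended >= min_attended

  # Most frequent attendees first, ties broken alphabetically.
  result <- result[order(-result$number_attended, result$person), ]
  rownames(result) <- NULL
  result
}

# Same toy data as in the answer above.
conferences <- list(
  c('a', 'b'), c('a', 'b', 'c'), c('a', 'b', 'c', 'e'), c('b', 'e'),
  c('a', 'd', 'e'), c('a', 'd', 'e'), c('a', 'b', 'e'), c('a', 'b', 'c')
)

result <- count_attendance(conferences)
print(result)
```

With the toy data, only `a` is flagged, matching the output shown earlier. If each list lives in a plain-text file with one name per line, something like `conferences <- lapply(files, readLines)` could build the input, where `files` is a hypothetical character vector of file paths.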