r, group-by, grouping, categorical-data

Creating a new categorical variable based on groups and current categorical variable


I am trying to create a categorical variable based on groups and an existing categorical variable.

My current df has the variables ID, GroupID, and Drinker. I am trying to create a new variable (GroupDrink) such that if any individual (ID) in a group (GroupID) has Yes for Drinker, then every individual in that group gets Yes for the new variable; otherwise they all get No. Please see the table below for more details.

ID  GroupID  Drinker  GroupDrink (new variable)
1   25       Yes      Yes
2   25       No       Yes
3   21       No       No
4   40       Yes      Yes
5   40       No       Yes
6   40       No       Yes

Solution

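  • # Only needed if plyr is attached; its mutate() masks dplyr's
    detach(package:plyr)
    library(dplyr)
    df %>%
      group_by(GroupID) %>%
      mutate(GroupDrink = case_when(
        any(Drinker == "Yes") ~ "Yes",  # at least one drinker in the group
        TRUE ~ "No"
      ))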
    

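    The case_when function works well for applying this kind of rule to groups. The plyr package does not have to be uninstalled, but it must not be attached after dplyr: plyr has its own mutate() that masks dplyr's version and ignores the grouping, so any() is evaluated over the whole data frame instead of within each group. I had to detach plyr for the function to apply correctly to groups.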
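
    To try this on the sample data from the question, here is a minimal sketch. It assumes the data frame looks like the table above (the `df` and `result` names and the construction code are mine, not from the original post). It calls the verbs as `dplyr::` so that an attached plyr cannot mask them, which is an alternative to detaching, and it adds `na.rm = TRUE` as a guess at how missing Drinker values should be treated.

```r
library(dplyr)

# Sample data copied from the table in the question
df <- data.frame(
  ID      = 1:6,
  GroupID = c(25, 25, 21, 40, 40, 40),
  Drinker = c("Yes", "No", "No", "Yes", "No", "No")
)

# Explicit dplyr:: prefixes sidestep masking by plyr,
# so nothing has to be detached first
result <- df %>%
  dplyr::group_by(GroupID) %>%
  dplyr::mutate(
    # One value per group, recycled to every row of that group.
    # na.rm = TRUE is an assumption: missing answers are ignored.
    GroupDrink = dplyr::if_else(any(Drinker == "Yes", na.rm = TRUE), "Yes", "No")
  ) %>%
  dplyr::ungroup()

print(result)
```

    With the sample data, groups 25 and 40 each contain one drinker, so every row in them gets Yes, and the single row in group 21 gets No. The `ungroup()` at the end just avoids carrying the grouping into later steps.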