r dplyr data-cleaning

Assigning new variable if a column takes specific values


I am trying to generate a new variable that identifies 'single parents' in a household, based on a group identifier. If a group contains a 'Child' but does not contain both a 'Head' and a 'Spouse', I would like the variable to take the value 1 for every row in that group, and 0 otherwise. I have tried using dplyr but have not been able to arrive at a solution.

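relation <- c("Head", "Spouse", "Child", "Head", "Spouse", "Head", "Child")
group <- c(1, 1, 1, 2, 2, 3, 3)
# data.frame() keeps group numeric; as.data.frame(cbind(...)) would coerce it to character
my_data <- data.frame(group, relation)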

# my attempt (does not give the desired output)
my_data %>%
  group_by(group) %>%
  mutate(single_parent = case_when(relation %in% "Child" & !(relation %in% "Head" & relation %in% "Spouse")~1))

# desired output
my_data$single_parent<-c(0,0,0,0,0,1,1)

Thank you for your help.


Solution

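  • After grouping, we could check that the group contains a 'Child' and does not contain both 'Head' and 'Spouse'. The test yields one logical value per group, which is recycled across the group's rows, and the leading + converts it to an integer: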

    library(dplyr)
    my_data <- my_data %>% 
      group_by(group) %>% 
      mutate(single_parent = +("Child" %in% relation &
         !all(c("Head", "Spouse") %in% relation))) %>%
      ungroup()
    

    -output

    my_data
    # A tibble: 7 × 3
      group relation single_parent
      <dbl> <chr>            <int>
    1     1 Head                 0
    2     1 Spouse               0
    3     1 Child                0
    4     2 Head                 0
    5     2 Spouse               0
    6     3 Head                 1
    7     3 Child                1
    

    data

    my_data <- data.frame(group, relation)
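
The grouped condition above can also be spelled out step by step, which may make the logic easier to check. This is only a sketch on the sample data from the question: the helper columns `has_child` and `has_couple` are illustrative names, not part of the original answer, and `as.integer()` is used here in place of the unary `+`.

```r
library(dplyr)

# Same sample data as in the question
group    <- c(1, 1, 1, 2, 2, 3, 3)
relation <- c("Head", "Spouse", "Child", "Head", "Spouse", "Head", "Child")
my_data  <- data.frame(group, relation)

result <- my_data %>%
  group_by(group) %>%
  mutate(
    # Each test returns a single TRUE/FALSE per group,
    # which mutate() recycles to every row of that group.
    has_child     = "Child" %in% relation,                   # hypothetical helper column
    has_couple    = all(c("Head", "Spouse") %in% relation),  # hypothetical helper column
    single_parent = as.integer(has_child & !has_couple)
  ) %>%
  ungroup()

result$single_parent
```

Only group 3 has a 'Child' without both a 'Head' and a 'Spouse', so only its rows are flagged. The helper columns can be dropped afterwards with `select(-has_child, -has_couple)` if they are not needed.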