
Creating a new column in a dataframe based on the answer choices in the other columns


I'm not sure how to populate a new column based on the combination of character values in my other columns.

Here is my original dataframe:

df <-  data.frame('Hispanic'=c("N", "Y", "N", "N"), 'Black'=c("Y", "N", "N", "Null"), 'Asian'=c("N", "Y", "N", "N"), 
                  'HN'=c("N", "N", "N", "N"), 'AN'=c("N", "N", "N", "Y"), 'White'=c("N", "Y", "N", "Null"), 
                  'NA'=c("N", "N", "Y", "Y"))
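One detail of this construction matters when reading any output: data.frame() checks column names by default (check.names = TRUE), and because NA is a reserved word in R, a column named 'NA' is stored as NA. instead. A small check, using two throwaway data frames (checked and kept are illustrative names, not part of the data above):

```r
# With the default check.names = TRUE, data.frame() passes names through
# make.names(), which appends a dot to reserved words such as NA.
checked <- data.frame('NA' = c("N", "Y"))

# With check.names = FALSE the name is kept, but the column then has to be
# referenced with backticks, e.g. kept$`NA`.
kept <- data.frame('NA' = c("N", "Y"), check.names = FALSE)

names(checked)  # "NA."
names(kept)     # "NA"
```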

I want to code the values in the new column based on different combinations of race and ethnicity. Specifically, I'm trying to sort the records into the categories Black (Non-Hispanic), Asian (Non-Hispanic), Native Hawaiian (Non-Hispanic), American Indian/Alaska Native (Non-Hispanic), Multiracial (Non-Hispanic), and Hispanic. Whenever a record has Hispanic as "Y", the new value should simply be Hispanic. If Hispanic is "N", the value should be either the single race selected plus Non-Hispanic (e.g. "Black, NH") or, if more than one race was selected, multiracial and Non-Hispanic (e.g. "Multiracial, NH").

The goal is to get something that looks like the results below:

df1 <- data.frame('Hispanic'=c("N", "Y", "N", "N"), 'Black'=c("Y", "N", "N", "Null"), 'Asian'=c("N", "Y", "N", "N"), 
                  'HN'=c("N", "N", "N", "N"), 'AN'=c("N", "N", "N", "Y"), 'White'=c("N", "Y", "N", "Null"), 
                  'NA'=c("N", "N", "Y", "Y"), 
                  'R_E'=c("Black, NH", "Hispanic", "Native American, NH", "Multi-racial, NH" )) 

Solution

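    library(dplyr)   # %>%, left_join, group_by, mutate, summarise
    library(tidyr)   # pivot_longer
    library(tibble)  # rowid_to_column

    df %>%
      rowid_to_column() %>%
      # reshape to one row per (rowid, column), then collapse each rowid to one label
      left_join(pivot_longer(., -rowid) %>%
        group_by(rowid) %>%
        mutate(value = value == 'Y') %>%   # "N" and "Null" both become FALSE
        summarise(value = if (any(name == 'Hispanic' & value))
          'Hispanic' else paste(if (sum(value) > 1)
          'multiracial' else name[value], 'NH')))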
    
          rowid Hispanic Black Asian HN AN White NA.          value
    1     1        N     Y     N  N  N     N   N       Black NH
    2     2        Y     N     Y  N  N     Y   N       Hispanic
    3     3        N     N     N  N  N     N   Y         NA. NH
    4     4        N  Null     N  N  Y  Null   Y multiracial NH
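The labels above are built from the raw column names, so row 3 reads "NA. NH" rather than the "Native American, NH" asked for in the question. If the exact wording matters, one option is a lookup vector from column name to display label. The following is a minimal base R sketch of that idea, not the pipeline above: race_labels, selected, n_races and single are hypothetical names, the meanings assigned to HN, AN and NA. are assumptions, and returning NA when no race is selected is also an assumption, since the question does not cover that case.

```r
# Same data as in the question; check.names = TRUE (the default) stores 'NA' as 'NA.'
df <- data.frame('Hispanic' = c("N", "Y", "N", "N"),
                 'Black'    = c("Y", "N", "N", "Null"),
                 'Asian'    = c("N", "Y", "N", "N"),
                 'HN'       = c("N", "N", "N", "N"),
                 'AN'       = c("N", "N", "N", "Y"),
                 'White'    = c("N", "Y", "N", "Null"),
                 'NA'       = c("N", "N", "Y", "Y"))

# Hypothetical display labels; the meanings of HN, AN and NA. are assumptions
race_labels <- c(Black = "Black", Asian = "Asian", HN = "Native Hawaiian",
                 AN = "Alaska Native", White = "White", NA. = "Native American")

# Logical matrix: TRUE where a race column is "Y"; "N" and "Null" count as not selected
selected <- df[names(race_labels)] == "Y"
n_races  <- rowSums(selected)

# Label of the first selected race in each row (only used when exactly one is selected)
single <- unname(apply(selected, 1, function(r) race_labels[which(r)[1]]))

# Hispanic takes priority, then multi-racial, then the single race; otherwise NA
df$R_E <- ifelse(df$Hispanic == "Y", "Hispanic",
          ifelse(n_races > 1, "Multi-racial, NH",
          ifelse(n_races == 1, paste0(single, ", NH"), NA_character_)))

df$R_E
```

For the four example rows this gives "Black, NH", "Hispanic", "Native American, NH" and "Multi-racial, NH", which matches the R_E column in the desired result.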