r database factors

How to recode a three level factor


My dataset contains information about children, and I have a question about recoding factors. I have two variables, Parent 1 finance and Parent 2 finance, each of which takes one of three values: low, medium, or high income. I want to create a third variable, "guardian finance", because some of my subjects have only one parent. How can I recode the data so that the new variable takes the higher of the two parents' finance levels and, if the child is from a one-parent household, that parent's value is carried over to the new "Guardian" variable?

       p1          n
       <fct>   <int>
1      low       100
2      medium    306
3      high       96

       p2          n
       <fct>   <int>
1      low       227
2      medium    230
3      high      243

Solution

  • If we want the highest 'n' for each income level across the two datasets, join them on the 'p' columns and then use pmax to return the row-wise maximum of the two 'n' columns (the join names them n.x and n.y)

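    library(dplyr)
    # Match rows where p1 in df1 equals p2 in df2; the two 'n' columns become n.x and n.y
    inner_join(df1, df2, by = c("p1" = "p2")) %>%
         # Keep the larger of the two counts for each level
         mutate(n = pmax(n.x, n.y)) %>%
         select(p1, n)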
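
If the goal is instead the per-child recode described in the question, the following is a minimal sketch in base R, not a definitive implementation. It assumes a hypothetical data frame `children` with one row per child, columns `p1` and `p2` stored as ordered factors with levels low < medium < high, and `NA` marking an absent parent. None of these names or values come from the original data.

```r
# Hypothetical per-child data; NA marks a parent who is not present.
lv <- c("low", "medium", "high")
children <- data.frame(
  p1 = factor(c("low", "high", "medium", NA), levels = lv, ordered = TRUE),
  p2 = factor(c("medium", NA, "low", "high"), levels = lv, ordered = TRUE)
)

# Compare the integer level codes (1 = low, 2 = medium, 3 = high).
# na.rm = TRUE keeps the value of the one parent who is present and
# yields NA only when both parents are missing.
idx <- pmax(as.integer(children$p1), as.integer(children$p2), na.rm = TRUE)

# Map the winning code back to an ordered factor with the same levels.
children$guardian <- factor(lv[idx], levels = lv, ordered = TRUE)
```

With this sample data, `guardian` comes out as medium, high, medium, high. Working on the integer codes rather than the labels is a deliberate choice here: the comparison then follows the level order and not the alphabetical order of the labels.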