Subsetting a large data frame leaves us with a factor variable whose levels need reordering and whose unused levels need dropping. A reprex is below:
library(tidyverse)
set.seed(1234)
data <- c("10th Std. Pass", "11th Std. Pass", "12th Std. Pass", "5th Std. Pass",
          "6th Std. Pass", "Diploma / certificate course", "Graduate", "No Education")
education <- factor(sample(data, size = 5, replace = TRUE),
                    levels = c(data, "Data not available"))
survey <- tibble(education)
The code further below, as per this answer, achieves what we want, but we'd like to integrate the reordering and dropping of levels into our piped recoding of the survey.
recoded_s <- survey %>%
  mutate(education = fct_collapse(
    education,
    "None" = "No Education",
    "Primary" = c("5th Std. Pass", "6th Std. Pass"),
    "Secondary" = c("10th Std. Pass", "11th Std. Pass", "12th Std. Pass"),
    "Tertiary" = c("Diploma / certificate course", "Graduate")
  ))
recoded_s$education
#> [1] Secondary Primary Primary Primary Tertiary
#> Levels: Secondary Primary Tertiary None Data not available
# Reordering the levels and dropping the unused "Data not available" level
factor(recoded_s$education, levels = c("None", "Primary", "Secondary", "Tertiary"))
#> [1] Secondary Primary Primary Primary Tertiary
#> Levels: None Primary Secondary Tertiary
Any pointers would be much appreciated!
I'm not sure I understand. Could you elaborate on why wrapping everything inside a single mutate call doesn't suffice?
library(tidyverse)
library(forcats)
survey %>%
  mutate(
    education = fct_collapse(
      education,
      "None" = "No Education",
      "Primary" = c("5th Std. Pass", "6th Std. Pass"),
      "Secondary" = c("10th Std. Pass", "11th Std. Pass", "12th Std. Pass"),
      "Tertiary" = c("Diploma / certificate course", "Graduate")),
    education = factor(education, levels = c("None", "Primary", "Secondary", "Tertiary")))
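If you would rather stay within forcats for the whole step, one possible sketch chains fct_relevel() and fct_drop() after fct_collapse(). The data here is a small hand-written sample (an assumption for illustration, not the original survey), and only = "Data not available" is passed so that the empty "None" level is kept, as in the desired output in the question. Note that, unlike factor(x, levels = ...), fct_drop() only removes a level when no rows use it.

```r
library(tidyverse)

# Hypothetical sample data: fixed values instead of sample(), so the result is deterministic
data <- c("10th Std. Pass", "11th Std. Pass", "12th Std. Pass", "5th Std. Pass",
          "6th Std. Pass", "Diploma / certificate course", "Graduate", "No Education")
education <- factor(
  c("12th Std. Pass", "5th Std. Pass", "6th Std. Pass", "Graduate"),
  levels = c(data, "Data not available")
)
survey <- tibble(education)

recoded <- survey %>%
  mutate(
    education = education %>%
      # Merge the detailed levels into four broad groups
      fct_collapse(
        "None" = "No Education",
        "Primary" = c("5th Std. Pass", "6th Std. Pass"),
        "Secondary" = c("10th Std. Pass", "11th Std. Pass", "12th Std. Pass"),
        "Tertiary" = c("Diploma / certificate course", "Graduate")
      ) %>%
      # Move the four groups to the front in the desired order
      fct_relevel("None", "Primary", "Secondary", "Tertiary") %>%
      # Drop "Data not available" if (and only if) no row uses it
      fct_drop(only = "Data not available")
  )

levels(recoded$education)
#> [1] "None"      "Primary"   "Secondary" "Tertiary"
```

If rows with "Data not available" should instead become NA, the factor(education, levels = ...) call shown above is the simpler choice.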
Alternatively, using dplyr::recode with a lookup list:
lvls <- list(
  "No Education" = "None",
  "5th Std. Pass" = "Primary",
  "6th Std. Pass" = "Primary",
  "10th Std. Pass" = "Secondary",
  "11th Std. Pass" = "Secondary",
  "12th Std. Pass" = "Secondary",
  "Diploma / certificate course" = "Tertiary",
  "Graduate" = "Tertiary")
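survey %>%
  mutate(
    # recode() maps each old level to its new label; factor() then sets the level
    # order and drops any level not listed in lvls ("Data not available")
    education = factor(recode(education, !!!lvls), levels = unique(map_chr(lvls, 1))))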