
Reorder factor levels by pattern


I have a factor that identifies strata within a survey dataset. I want to reorder the factor so that certain character patterns come before others.

For example, I have this mixed up factor which indicates gender, age, and education:

my_factor <- factor(levels=c(1:8),
                    labels=c("Male-18_34-HS","Female-35_49-HS",
                             "Male-18_34-CG", "Female-18_34-CG",
                             "Male-35_49-HS", "Male-35_49-CG",
                             "Female-18_34-HS", "Female-35_49-CG"),
                    ordered=TRUE)

I'd like this to be ordered with all Female categories first, then the age categories in the correct order, then the education categories in the correct order. I can get most of the way there with forcats::fct_relevel:

forcats::fct_relevel(my_factor, sort)

ordered(0)
8 Levels: Female-18_34-CG < Female-18_34-HS < Female-35_49-CG < Female-35_49-HS < Male-18_34-CG < Male-18_34-HS < ... < Male-35_49-HS

But the education categories are in the wrong order. Is there a way to make sure that "HS" comes before "CG" but leave the order of gender and age groups the same?


Solution

  • You can create your desired factor levels programmatically.

    lvls <- do.call(paste, c(tidyr::expand_grid(
               c('Female', 'Male'), c('18_34', '35_49'), c('HS', 'CG')), sep = '-'))
    lvls
    #[1] "Female-18_34-HS" "Female-18_34-CG" "Female-35_49-HS" "Female-35_49-CG"
    #[5] "Male-18_34-HS"   "Male-18_34-CG"   "Male-35_49-HS"   "Male-35_49-CG"
    

    You can then pass lvls as the levels argument in the factor() call.
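
As a complement, here is a minimal base R sketch that derives the level order from the factor's existing levels instead of retyping them. It assumes every level has the form gender-age-education separated by "-". The `strata` vector, the one-observation-per-stratum data, and the name `reordered` are hypothetical and only there to make the example runnable.

```r
# Hypothetical data: one observation per stratum, in the question's mixed-up order
strata <- c("Male-18_34-HS", "Female-35_49-HS", "Male-18_34-CG", "Female-18_34-CG",
            "Male-35_49-HS", "Male-35_49-CG", "Female-18_34-HS", "Female-35_49-CG")
my_factor <- factor(strata, levels = strata, ordered = TRUE)

# Split each level into its gender, age, and education parts (one row per level)
parts <- do.call(rbind, strsplit(levels(my_factor), "-", fixed = TRUE))

# Rank each part by its position in the desired order, then sort on all three
ord <- order(match(parts[, 1], c("Female", "Male")),
             match(parts[, 2], c("18_34", "35_49")),
             match(parts[, 3], c("HS", "CG")))
lvls <- levels(my_factor)[ord]

# Rebuild the factor with the new level order; the observations themselves are unchanged
reordered <- factor(my_factor, levels = lvls, ordered = TRUE)
levels(reordered)
```

Using match() against an explicit vector for each part lets "HS" sort before "CG" even though that is not alphabetical, while gender and age keep the order you list them in.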