Tags: r, dplyr, forcats

Relabel levels of factors as desired


I would like to relabel the levels of factors as follows:

i. If a level contains three or more words, relabel it in sentence case; otherwise use title case.

For example, "home doing nothing" becomes "Home doing nothing", "yes" becomes "Yes", and "good practice" becomes "Good Practice".

Is there any way to do this?

library(tidyverse)

vars <- c("a", "b", "c", "d", "e")


mydata <- tribble(
  ~"a", ~"b", ~"c", ~"d", ~"e", ~"id",
  "yes", "school in Kenya", "r", 10, "good practice", 1,
  "no", "home doing nothing", "python", 12, "doing well", 3,
  "no", "school in Tanzania", "c++", 35, "by walking", 4,
  "yes", "home practising", "c", 65, "practising everyday", 5,
  "no", "home", "java", 78, "sitting alone", 7
) %>%
  mutate(across(.cols = all_of(vars), ~as_factor(.)))


# mydata %>%
#   mutate(across(where(is.factor), ~fct_relabel(., str_to_sentence(.))))


Solution

  • Here is one possible solution. Note that the code works on character columns rather than factors: in the output shown, columns a, b, c and e are character (<chr>) variables.

    library(dplyr)
    library(stringr)
    
    mydata %>%
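      mutate(across(where(is.character), 
                    # three or more words: capitalise only the first letter;
                    # fewer than three words: title case
                    ~ if_else(stringi::stri_count_words(.x) >= 3, 
                              sub("^([a-z])", "\\U\\1\\E", .x, perl = TRUE), 
                              str_to_title(.x))))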
    
    # A tibble: 5 x 6
      a     b                  c          d e                      id
      <chr> <chr>              <chr>  <dbl> <chr>               <dbl>
    1 Yes   School in Kenya    R         10 Good Practice           1
    2 No    Home doing nothing Python    12 Doing Well              3
    3 No    School in Tanzania C++       35 By Walking              4
    4 Yes   Home Practising    C         65 Practising Everyday     5
    5 No    Home               Java      78 Sitting Alone           7
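
If the columns really are factors, as the question's as_factor() step would make them, the same rule can be applied to the levels with forcats::fct_relabel(), which expects a function rather than an already-evaluated call. The following is only a minimal sketch under that assumption: relabel_by_word_count is a hypothetical helper name, not part of the original answer, and the tibble is a cut-down version of the question's data. It counts words as runs of non-whitespace and uppercases only the first character, so capitals such as "Kenya" are kept (str_to_sentence() would lowercase them).

```r
library(dplyr)
library(forcats)
library(stringr)

# Hypothetical helper (an assumption, not from the original answer):
# sentence case for labels with three or more words, title case otherwise.
relabel_by_word_count <- function(x) {
  n_words <- str_count(x, "\\S+")  # words = runs of non-whitespace
  # Uppercase only the first character so later capitals ("Kenya") survive.
  sentence <- str_c(str_to_upper(str_sub(x, 1, 1)), str_sub(x, 2))
  if_else(n_words >= 3, sentence, str_to_title(x))
}

# Cut-down version of the question's data, with every column as a factor.
mydata <- tibble(
  a = c("yes", "no"),
  b = c("school in Kenya", "home doing nothing"),
  e = c("good practice", "doing well")
) %>%
  mutate(across(everything(), as_factor))

# fct_relabel() applies the function to the levels, not to each value.
relabelled <- mydata %>%
  mutate(across(where(is.factor), ~ fct_relabel(.x, relabel_by_word_count)))

levels(relabelled$b)  # expected levels: "School in Kenya", "Home doing nothing"
levels(relabelled$e)  # expected levels: "Good Practice", "Doing Well"
```

Because the function is applied to the levels, the columns stay factors and the level order is unchanged.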