R: Label factor columns using character objects


I have a data.frame whose columns need to be labeled.

df <- structure(list(q1 = c("5", "6", "5", "5", "7", "5", "5", "5", 
"5", "6", "5", "6", "6", "6", "7", "6", "5", "6", "5", "6", "6", 
"5", "7", "5", "6", "6", "5", "6", "6", "5", "5", "5", "5", "5", 
"5", "5", "4", "5", "5", "4", "4", "5", "4", "4", "5", "4", "5", 
"5", "4", "5"), q2 = c("2", "2", "1", "1", "2", "1", "1", "2", 
"1", "1", "1", "2", "1", "1", "2", "1", "2", "1", "2", "2", "2", 
"1", "2", "2", "2", "2", "2", "1", "1", "2", "2", "2", "2", "2", 
"2", "1", "2", "1", "1", "1", "1", "2", "1", "1", "1", "1", "2", 
"2", "1", "2"), q3 = c("3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2")), row.names = c(NA, -50L), class = c("tbl_df", 
"tbl", "data.frame"), na.action = structure(c(`71` = 71L, `78` = 78L, 
`96` = 96L, `250` = 250L, `393` = 393L, `488` = 488L, `644` = 644L, 
`847` = 847L, `862` = 862L, `1083` = 1083L, `1120` = 1120L, `1149` = 1149L, 
`1322` = 1322L, `1357` = 1357L), class = "omit"))

Each column needs a different set of labels. Those labels are stored in separate objects, as shown below.

Q1_Label <- c("12 y",  "13 y",  "14 y", "15 y",  "16 y",  "17 y",  "18 y")

Q2_Label <-  c("Female", "Male" )

Q3_Label <- c("9th", "10th", "11th",  "12th", "Ung")

How can I label the data frame columns using the character objects in as few lines of code as possible?

Below is code that tries to do that, but I cannot get the column name inside the sapply call.

Thanks in advance for your help.

a_df <- sapply(X = df, FUN = function(x) factor(x, 
                   levels = 1:length(table(x)), 
                   labels = get(paste(toupper(names(x)), "_Label", sep = "")) # This line is where I get the problem
))

Solution

  • We can put all the labels in a list and convert each column with factor.

    # Collect Q1_Label, Q2_Label, ... into a list, then relabel each column in turn.
    df[] <- Map(function(x, y) factor(x, labels = y[seq_along(unique(x))]),
                df, mget(ls(pattern = "Q\\d+_Label")))
    df
    # A tibble: 50 x 3
    #   q1    q2     q3   
    #   <fct> <fct>  <fct>
    # 1 13 y  Male   10th 
    # 2 14 y  Male   10th 
    # 3 13 y  Female 10th 
    # 4 13 y  Female 10th 
    # 5 15 y  Male   10th 
    # 6 13 y  Female 10th 
    # 7 13 y  Female 10th 
    # 8 13 y  Male   10th 
    # 9 13 y  Female 10th 
    #10 14 y  Female 10th 
    # … with 40 more rows
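
The attempt in the question fails because sapply hands each column to the function as a bare vector, so names(x) is typically NULL and the lookup ends up asking for an object called "_Label". A minimal sketch of one way around this builds the label object names from the column names first. It uses a small stand-in data frame rather than the full data, and it assumes that code k means the k-th label. That is a different mapping from the answer above, which gives the first n labels to the n distinct values observed in a column. Which of the two is intended is not stated in the question, and the names label_names, label_list and labelled are only illustrative.

```r
# Small stand-in for the question's data: character codes, one column per question.
df <- data.frame(
  q1 = c("5", "6", "7", "4"),
  q2 = c("2", "1", "2", "1"),
  q3 = c("3", "3", "2", "2"),
  stringsAsFactors = FALSE
)

Q1_Label <- c("12 y", "13 y", "14 y", "15 y", "16 y", "17 y", "18 y")
Q2_Label <- c("Female", "Male")
Q3_Label <- c("9th", "10th", "11th", "12th", "Ung")

# Build "Q1_Label", "Q2_Label", ... from the column names, so each column
# is paired with its own label vector regardless of how ls() would sort them.
label_names <- paste0(toupper(names(df)), "_Label")
label_list <- mget(label_names)

# Assumption: code k maps to the k-th label, so the levels are 1..length(labels).
labelled <- df
labelled[] <- Map(
  function(x, labels) factor(x, levels = seq_along(labels), labels = labels),
  df,
  label_list
)
```

Because the levels are fixed to the full range of codes, labels that do not occur in a column stay available as empty levels, and any code outside that range becomes NA instead of silently shifting the other labels. Deriving the object names from names(df) also avoids relying on the alphabetical order returned by ls(), where a name such as Q10_Label would usually sort before Q2_Label.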