
How to use lookup table to label columns in R with clean variable names?


I'm trying to figure out the best way to apply labels to my columns in R. A coworker recommended a lookup table and gave me some starter code for it, but I don't understand how to actually use the clean variable names when I'm creating figures or tables.

Here's a sample df (the actual one I'm working with is quite large), my current code for the lookup table, and a couple examples of figures/tables I'm creating:

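    library(dplyr)     # tibble(), mutate(), %>%
    library(ggplot2)   # ggplot(), geom_bar()
    library(tableone)  # CreateTableOne()

    # Creating sample df
    x <- c("A", "B", "C")
    y <- c(1, 2, 3)
    df <- data.frame(var1 = x, var2 = y)

    # Creating lookup table: a named vector mapping raw names to clean names
    vars <- c("var1", "var2")
    vars_clean <- c("Var 1", "Var 2")
    names(vars_clean) <- vars

    # The same lookup as a tibble
    lookup_tibble <- tibble(a = c("var1", "var2")) %>%
      mutate(a_clean = vars_clean[a])

    # Example figure
    ggplot(data = df, aes(var1)) +
      geom_bar()

    # Example table
    CreateTableOne(vars = vars, data = df)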

Is that the best way to create a lookup table for a large dataset? Once I've done that, how do I actually use the clean variable names when creating figures and tables?

Thanks!


Solution

  • One option is to pass the clean name to labs(). A drawback of this approach is that, for every label, you have to spell out both the aesthetic (the scale or guide) you want to label and the name of the variable:

    library(ggplot2)
    
    ggplot(data=df, aes(var1))+
      geom_bar() + 
      labs(x = vars_clean[["var1"]])
    

    A second approach, which avoids these drawbacks, is to use ggeasy::easy_labs, which builds on the labelled package. Here the labels are stored as attributes on the columns of the dataset, and easy_labs() picks them up for whichever variables are mapped in the plot.

    library(ggeasy)
    library(labelled)
    
    # as.list() gives a named list, so labels are matched to columns by name
    labelled::var_label(df) <- as.list(vars_clean)
    
    ggplot(data=df, aes(var1))+
      geom_bar() + 
      easy_labs()
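
For the large-dataset and table parts of the question, here is a minimal base R sketch, not a definitive approach. It assumes the lookup is kept as a two-column data frame (for example, read from a CSV); the column names `name` and `label` and the helper `label_for()` are hypothetical and not part of any package.

```r
# Sample data from the question
df <- data.frame(var1 = c("A", "B", "C"), var2 = c(1, 2, 3))

# Lookup table as a data frame (assumed column names: `name`, `label`)
lookup <- data.frame(
  name  = c("var1", "var2"),
  label = c("Var 1", "Var 2")
)

# Convert to a named vector: names are raw column names, values are labels
vars_clean <- setNames(lookup$label, lookup$name)

# Hypothetical helper: return the clean label for a column, falling back
# to the raw name when the lookup has no entry for it
label_for <- function(var, lookup = vars_clean) {
  if (var %in% names(lookup)) lookup[[var]] else var
}

# Rename a copy of the data for display; columns without a label keep
# their raw names
df_display <- df
names(df_display) <- vapply(names(df), label_for, character(1),
                            USE.NAMES = FALSE)

names(df_display)
```

In a plot, the helper could be used as `labs(x = label_for("var1"))`, and `df_display` can be handed to whatever produces the table. For the CreateTableOne case specifically, it may also be worth checking the `varLabels` argument of tableone's print method, which, as far as I know, uses the labels set via labelled::var_label.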