r ggplot2 expss

Trying to implement use_labels from expss package on a ggplot


I have a data frame whose variables I labelled with the expss package. Example data:

library(expss)
data = apply_labels(data,
state= "State",
Q1_Gender_1 = "Male",
Q1_Gender_2 = "Female")

The structure of the data ends up looking like this:

dput(data)
structure(list(state = structure("Iowa", label = "State", class = c("labelled", 
"character")), Q1_Gender_1 = structure(0.11, label = "Male", class = c("labelled", 
"numeric")), Q1_Gender_2 = structure(0.89, label = "Female", class = c("labelled", 
"numeric"))), class = "data.frame", row.names = c(NA, -1L))

The plot for this data worked before I applied the labels, but I can't figure out how to apply use_labels so that the plot shows the variable labels.

p<- data %>% 
  select(-state)%>% 
  pivot_longer(everything(), names_to="variable", values_to="value") %>%
  ggplot(aes(x = reorder(variable, value), y = value, fill = variable, text = paste0(value*100, "%"))) +
      geom_bar(stat = "identity",position = "dodge")+
      theme(axis.title.x=element_blank(),
            axis.text.x=element_blank(),
            axis.ticks.x=element_blank(),
            axis.title.y=element_blank())+
      coord_flip()+
      theme(legend.position = "none")

The expss vignette says that use_labels is supposed to be applied like this:

use_labels(mtcars, {
    # '..data' is shortcut for all 'mtcars' data.frame inside expression 
    ggplot(..data) +
        geom_point(aes(y = mpg, x = wt, color = qsec)) +
        facet_grid(factor(am) ~ factor(vs))
}) 

I've tried every way I can think of to apply use_labels. My understanding is that the syntax is basically use_labels(data, {expr}), but the package documentation also shows the usage as use_labels(data, expr), and shows that it can be used within other expss functions, as in calculate(data, expr, use_labels = FALSE).

I need some help figuring out how to apply use_labels. Or is there a better way to apply labels to a dataset for use in a plot like this?


Solution

  • You need to pass your data.frame as the first argument to use_labels and use ..data as a placeholder for it in the expression:

    library(dplyr)
    library(tidyr)
    library(ggplot2)
    library(expss)
    data = structure(list(
        state = structure("Iowa", label = "State", class = c("labelled", "character")),
        Q1_Gender_1 = structure(0.11, label = "Male", class = c("labelled", "numeric")),
        Q1_Gender_2 = structure(0.89, label = "Female", class = c("labelled", "numeric"))
    ), class = "data.frame", row.names = c(NA, -1L))
    use_labels(data, {
        ..data %>% 
            select(-state)%>% 
            drop_all_labels() %>% 
            pivot_longer(everything(), names_to="variable", values_to="value") %>%
            ggplot(aes(x = reorder(variable, value), y = value, fill = variable, text = paste0(value*100, "%"))) +
            geom_bar(stat = "identity",position = "dodge")+
            theme(axis.title.x=element_blank(),
                  axis.text.x=element_blank(),
                  axis.ticks.x=element_blank(),
                  axis.title.y=element_blank())+
            coord_flip()+
            theme(legend.position = "none")
    })
    

    Additionally, we need to remove the labels with drop_all_labels (as in the code above), because pivot_longer is too smart and ignores the method for combining the labelled class.
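
To make the label-stripping step more concrete, here is a rough base R sketch of the same idea done by hand: save the labels in a plain lookup vector, reduce each column to a bare numeric vector, and build the long table with the labels as the variable names. The names num_cols, label_lookup, plain and long are made up for this illustration and are not part of expss or tidyr. This is only meant to show what the pipeline needs, not how drop_all_labels or pivot_longer work internally.

```r
# Same one-row data as in the question, built with base R only.
data <- structure(list(
  state = structure("Iowa", label = "State", class = c("labelled", "character")),
  Q1_Gender_1 = structure(0.11, label = "Male", class = c("labelled", "numeric")),
  Q1_Gender_2 = structure(0.89, label = "Female", class = c("labelled", "numeric"))
), class = "data.frame", row.names = c(NA, -1L))

# Columns to plot: everything except the state column.
num_cols <- setdiff(names(data), "state")

# Save the labels in a plain named character vector before stripping them.
label_lookup <- vapply(
  num_cols,
  function(nm) attr(data[[nm]], "label", exact = TRUE),
  character(1)
)

# Drop the class and attributes so each column is a bare numeric vector.
plain <- lapply(data[num_cols], function(x) as.numeric(unclass(x)))

# Build the long table by hand, using the labels as the variable names.
long <- data.frame(
  variable = unname(label_lookup[num_cols]),
  value = unlist(plain, use.names = FALSE),
  stringsAsFactors = FALSE
)
print(long)
```

A table like long could then be passed to the same ggplot call as in the question, with the labels already in the variable column.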

    use_labels just replaces all names in the data.frame and in the expression with their variable labels.
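
The renaming described above can be imitated in base R to see what the data looks like once names are swapped for labels. The helper names_to_labels below is hypothetical (it is not an expss function) and only mimics the data.frame half of the behaviour. The real use_labels also rewrites the names inside the expression, which this sketch does not attempt.

```r
# Same one-row data as in the question, built with base R only.
data <- structure(list(
  state = structure("Iowa", label = "State", class = c("labelled", "character")),
  Q1_Gender_1 = structure(0.11, label = "Male", class = c("labelled", "numeric")),
  Q1_Gender_2 = structure(0.89, label = "Female", class = c("labelled", "numeric"))
), class = "data.frame", row.names = c(NA, -1L))

# Hypothetical helper: rename each column to its "label" attribute,
# keeping the original name when a column has no label.
names_to_labels <- function(df) {
  labs <- vapply(names(df), function(nm) {
    lab <- attr(df[[nm]], "label", exact = TRUE)
    if (is.null(lab)) nm else as.character(lab)
  }, character(1))
  names(df) <- unname(labs)
  df
}

relabelled <- names_to_labels(data)
print(names(relabelled))
```

With the columns renamed this way, the values in the variable column after pivoting are the labels rather than names like Q1_Gender_1, which is why the plot picks them up.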