Tags: r, ggplot2, user-defined-functions

Trouble creating a function to make plots with ggplot2 in R


I would like to define a function that creates multiple plots efficiently, instead of repeating lines of code. I am following the excellent examples at https://wilkelab.org/SDS375/slides/functional-programming.html#1

Basic plot works:

library(dplyr)
library(ggplot2)  # needed for ggplot(), aes() and geom_boxplot()
mydf <- data.frame("category" = as.factor(sample(c("type1", "type2"), 10, replace = TRUE)),
                   "var1" = runif(10, min = 0, max = 100),
                   "var2" = runif(10, min = 50, max = 150))

plot <- mydf %>% 
  ggplot() + 
  aes(x = category, y = var1, color = category) +
  geom_boxplot()
plot

So far so good. But when I try to do the same with a user-defined function, I run into errors passing variables:

make_plot <- function(data, var)  {
  data %>% 
    ggplot() +
    aes(x = .data$category, y = .data$var, color = .data$category) + 
  geom_boxplot(width = 0.2, notch = FALSE, position = position_dodge(1), lwd = 1.8, outlier.shape = NA)
}
plot <- make_plot(mydf, var1)
plot

This results in the error message "! Column var not found in .data." I have also tried using "{var}" instead of ".data$var", which results in "object 'var1' not found": it is looking for var1 but cannot find it within the data frame passed to the function. I would appreciate any solutions, and any help understanding where the problem is coming from.


Solution

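  • The problem is that .data$var looks for a column literally named "var", not for the column whose name is stored in the argument var. You can use the [[ accessor instead to extract the desired column, passing its name as a character string.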

    make_plot <- function(data, var)  {
      data %>% 
        ggplot() +
        aes(x = .data$category, y = .data[[var]], color = .data$category) + 
        geom_boxplot(width = 0.2, notch = FALSE, position = position_dodge(1), lwd = 1.8, outlier.shape = NA)
    }
    make_plot(mydf, 'var1')
    

    I hope this helps!
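
As a possible alternative to the string-based approach above, here is a minimal sketch of two variants. The first assumes a ggplot2 version recent enough for aes() to support the embrace operator {{ }} (3.3.0 or later, as far as I know), so the column can be passed as a bare name as in the original call. The second keeps the .data[[var]] form and loops over column names to build several plots. The function names make_plot_bare and make_plot_chr and the set.seed() call are illustrative assumptions, not part of the original post.

```r
library(ggplot2)

# Sample data mirroring the question's mydf (seed added only for reproducibility)
set.seed(1)
mydf <- data.frame(
  category = factor(sample(c("type1", "type2"), 10, replace = TRUE)),
  var1 = runif(10, min = 0, max = 100),
  var2 = runif(10, min = 50, max = 150)
)

# Variant 1 (hypothetical name): embrace the argument with {{ }} so the
# caller can pass a bare column name, e.g. make_plot_bare(mydf, var1)
make_plot_bare <- function(data, var) {
  ggplot(data) +
    aes(x = category, y = {{ var }}, color = category) +
    geom_boxplot()
}
p1 <- make_plot_bare(mydf, var1)

# Variant 2 (hypothetical name): take the column name as a string,
# which makes it easy to loop over several columns
make_plot_chr <- function(data, var) {
  ggplot(data) +
    aes(x = .data$category, y = .data[[var]], color = .data$category) +
    geom_boxplot()
}
plots <- lapply(c("var1", "var2"), function(v) make_plot_chr(mydf, v))
```

Printing p1 or plots[[2]] draws the corresponding boxplot. The string version is usually the more convenient one when the column names come from a vector, while the embraced version keeps the call looking like ordinary ggplot2 code.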