I would like to define a function that creates multiple plots efficiently, instead of repeating lines of code. I am following the excellent examples here: https://wilkelab.org/SDS375/slides/functional-programming.html#1
Basic plot works:
library(dplyr)
library(ggplot2)
mydf <- data.frame("category" = as.factor(sample(c("type1", "type2"), 10, replace = TRUE)),
"var1" = runif(10, min = 0, max = 100),
"var2" = runif(10, min = 50, max = 150))
plot <- mydf %>%
ggplot() +
aes(x = category, y = var1, color = category) +
geom_boxplot()
plot
So far so good. But when I try to do the same with a user-defined function, I run into errors passing the column as an argument:
make_plot <- function(data, var) {
data %>%
ggplot() +
aes(x = .data$category, y = .data$var, color = .data$category) +
geom_boxplot(width = 0.2, notch = FALSE, position = position_dodge(1), lwd = 1.8, outlier.shape = NA)
}
plot <- make_plot(mydf, var1)
plot
This results in the error message "! Column `var` not found in `.data`."
I have also tried using "{var}" instead of ".data$var", which results in "object 'var1' not found", so it is looking for var1 but cannot find it inside the data frame passed to the function. I would appreciate any solutions, and any help understanding where the problem comes from.
You can use the [[ accessor to extract the desired column, passing its name as a character string.
make_plot <- function(data, var) {
data %>%
ggplot() +
aes(x = .data$category, y = .data[[var]], color = .data$category) +
geom_boxplot(width = 0.2, notch = FALSE, position = position_dodge(1), lwd = 1.8, outlier.shape = NA)
}
make_plot(mydf, 'var1')
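As for where the error comes from: `$` always takes the name after it literally, so `.data$var` asks for a column that is actually called "var", whatever the argument holds. `[[` evaluates its argument first, so `.data[[var]]` uses the string stored in `var`. Below is a minimal sketch of how this could be used to build several plots at once, plus an alternative using the `{{ }}` (embrace) operator if you would rather pass the column name unquoted. The names `plots` and `make_plot2`, the `set.seed()` call, and the simplified `geom_boxplot()` are my own assumptions for illustration, not part of the question.

```r
library(ggplot2)

# Same toy data as in the question (seed added only for reproducibility)
set.seed(1)
mydf <- data.frame(
  category = as.factor(sample(c("type1", "type2"), 10, replace = TRUE)),
  var1 = runif(10, min = 0, max = 100),
  var2 = runif(10, min = 50, max = 150)
)

# String version: .data[[var]] looks up the column whose name is stored in var
make_plot <- function(data, var) {
  ggplot(data) +
    aes(x = .data$category, y = .data[[var]], color = .data$category) +
    geom_boxplot()
}

# One plot per column name, returned as a list
plots <- lapply(c("var1", "var2"), function(v) make_plot(mydf, v))

# Unquoted version (hypothetical name make_plot2):
# {{ var }} forwards the bare column name into aes()
make_plot2 <- function(data, var) {
  ggplot(data) +
    aes(x = category, y = {{ var }}, color = category) +
    geom_boxplot()
}

p <- make_plot2(mydf, var1)
```

The string version is usually the easier one to loop over, since the column names are ordinary character values; the embrace version reads more like regular ggplot2 code when you call the function by hand.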
I hope this helps!