Column reference data.table function R

I'm trying to write a function that refers to a column of the data table supplied as one of its arguments, as shown below:

df <- read.table(text = "x1 x2 y
CA 20 50
CA 30.5 100
CA 40.5 200
AZ 20.12 400
AZ 25 500
OR 86 600
OR 75 700
OR 45 800", header = TRUE)

df$x1 <- as.factor(df$x1)


make_freq <- function(df, var_name){
  df <- df 
  tb <- df[, .N, by = var_name][,prop_ := round(((N/sum(N))*100), digits = 0)][order(var_name)]
  gg1 <- ggplot(tb, aes(x = var_name, y = prop_)) +
    geom_bar(width = .35, stat = "identity", color = "darkblue", fill = "darkblue") +
    ggtitle(paste0("var_name")) +
    theme_bw() +
    theme(plot.title = element_text(size = 10)) +
    theme(axis.text.x = element_text(angle = 45)) 
  return(list(figure = gg1))
}

make_freq(df = df, var_name = x1)

Ideally, I want to run the function to create the ggplot figure for any categorical variable I pass via the var_name argument. I'm getting an "Object x1 not found" error, which makes me think I need to quote or unquote the var_name argument inside the function.


    1. Quote x1 in the call. There is no object named x1 in your workspace; it is only the name of a column.
    2. The by argument of a data.table accepts a character vector, so df[, .N, by = var_name] is fine. But [order(var_name)] is wrong: it orders the string stored in var_name rather than the column it names. Use [order(get(var_name))] instead.
    3. Because var_name is a character string, replace var_name with get(var_name) inside aes() in the ggplot call as well.
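
To make point 2 concrete, here is a minimal sketch on a small made-up table (the object names dt, tb and sorted are only for illustration). It shows that ordering the string held in var_name says nothing about the column, while get(var_name) looks the column up by name.

```r
library(data.table)

# Toy data, not the data from the question
dt <- data.table(x1 = c("OR", "CA", "AZ", "CA"))
var_name <- "x1"

# by accepts a character vector of column names
tb <- dt[, .N, by = var_name]

# Base R: ordering a single string just gives position 1,
# so it carries no information about the values in column x1
order(var_name)

# get(var_name) fetches the column called "x1" inside the data.table,
# so the rows are sorted by its values
sorted <- tb[order(get(var_name))]
sorted
```

If you would rather avoid get(), data.table's setorderv(tb, var_name) also takes the column name as a string.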

    Full code:

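    library(data.table)
    library(ggplot2)

    make_freq <- function(df, var_name){
        df <- as.data.table(df)  # read.table() returns a data.frame; .N and := need a data.table
        # Count rows per level, convert to a percentage, then sort by the column named in var_name
        tb <- df[, .N, by = var_name][, prop_ := round((N / sum(N)) * 100, digits = 0)][order(get(var_name))]
        gg1 <- ggplot(tb, aes(x = get(var_name), y = prop_)) +
            geom_bar(width = .35, stat = "identity", color = "darkblue", fill = "darkblue") +
            ggtitle(var_name) +  # the value of var_name, not the literal string "var_name"
            xlab(var_name) +     # otherwise the axis is labelled "get(var_name)"
            theme_bw() +
            theme(plot.title = element_text(size = 10)) +
            theme(axis.text.x = element_text(angle = 45))
        return(list(figure = gg1))
    }

    make_freq(df = df, var_name = "x1")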
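
As a possible alternative to get(), the sketch below uses setorderv() from data.table and the .data pronoun that ggplot2 exports, both of which accept the column name as a string. The function name make_freq2, the extra table element in the result, and the toy data frame are assumptions made for illustration; they are not part of the original question or answer.

```r
library(data.table)
library(ggplot2)

# Hypothetical variant of make_freq that avoids get()
make_freq2 <- function(df, var_name) {
  dt <- as.data.table(df)                    # accept data.frame input
  tb <- dt[, .N, by = var_name]              # counts per level
  tb[, prop_ := round(100 * N / sum(N))]     # share of rows, in percent
  setorderv(tb, var_name)                    # sort by the column named in var_name
  fig <- ggplot(tb, aes(x = .data[[var_name]], y = prop_)) +
    geom_col(width = 0.35, color = "darkblue", fill = "darkblue") +
    labs(title = var_name, x = var_name) +
    theme_bw() +
    theme(plot.title = element_text(size = 10),
          axis.text.x = element_text(angle = 45))
  list(table = tb, figure = fig)
}

# Toy data with the same x1 levels as in the question
df <- data.frame(x1 = c("CA", "CA", "CA", "AZ", "AZ", "OR", "OR", "OR"))
res <- make_freq2(df, "x1")
res$table
```

Printing res$figure draws the plot. Returning the summary table alongside the figure is optional, but it makes the percentages easy to check.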