
When passing df$var to a function, is it possible to get the name of 'var'?


I'm writing a function where I'd like to be able to pass in variables from a data frame as atomic vectors, like df$var (e.g., mtcars$mpg).

To keep the example very simple, say the function just returns data.frame(table(df$var)):

foo.function <- function(var) {
  data.frame(table(var))
}

head(foo.function(mtcars$mpg))
#>    var Freq
#> 1 10.4    2
#> 2 13.3    1
#> 3 14.3    1
#> 4 14.7    1
#> 5   15    1
#> 6 15.2    2

Notice that the name of the tabulated variable in the returned table is the internal name of the passed object (var) rather than its "original" name, mpg. Is it possible to retrieve mpg (just the name) from within the function, without changing or adding arguments? I was inclined to say no, since R is just receiving a vector of values, but I suspect R may have this capacity based on what it can do with non-standard evaluation (NSE).


Solution

  • We can use deparse(substitute(var)) to recover the expression that was passed as the argument, then strip everything up to the $ to get the column name:

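    foo.function <- function(var) {
      # substitute() captures the unevaluated argument (mtcars$mpg);
      # sub() drops everything up to and including the "$"
      print(sub(".*\\$", "", deparse(substitute(var))))
      data.frame(table(var))
    }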
    
    head(foo.function(mtcars$mpg), 4)
    #[1] "mpg"
    #   var Freq
    #1 10.4    2
    #2 13.3    1
    #3 14.3    1
    #4 14.7    1
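    
    To make the mechanism concrete, here is a minimal sketch of what each step returns. The helper name show.steps is hypothetical and exists only for illustration; it assumes the built-in mtcars data set used above.

    ```r
    # Hypothetical helper (not part of the original answer): expose each step.
    show.steps <- function(var) {
      expr <- substitute(var)   # the unevaluated argument, a call object
      txt  <- deparse(expr)     # the same expression as a character string
      list(class = class(expr), text = txt, name = sub(".*\\$", "", txt))
    }

    steps <- show.steps(mtcars$mpg)
    steps$class  # "call"
    steps$text   # "mtcars$mpg"
    steps$name   # "mpg"

    # With [[ ]] there is no "$" to strip, so the whole expression text comes back
    show.steps(mtcars[["mpg"]])$name
    ```

    This works because the argument is still an unevaluated promise when substitute() is called, so the function sees the expression as written at the call site and not only the resulting vector.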
    

    If we also need to rename the column in the result:

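    foo.function <- function(var) {
      # recover "mpg" from the expression mtcars$mpg
      nm1 <- sub(".*\\$", "", deparse(substitute(var)))
      out <- data.frame(table(var))
      # replace the default first column name ("var") with the recovered name
      names(out)[1] <- nm1
      out
    }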
    
    head(foo.function(mtcars$mpg), 4)
    #   mpg Freq
    #1 10.4    2
    #2 13.3    1
    #3 14.3    1
    #4 14.7    1
    

    As @RonakShah noted in the comments, it is better to pass the column name and the data as separate arguments. If the function is limited to a single argument and that argument is always written with $, then the function above can retrieve the column name.
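
For comparison, a minimal sketch of the separate-arguments design mentioned above. The function name foo.function2 and its parameters df and col are assumptions made for illustration; they do not come from the question or the comment.

```r
# Hypothetical alternative: take the data frame and the column name separately,
# so no expression parsing is needed.
foo.function2 <- function(df, col) {
  stopifnot(is.data.frame(df), is.character(col),
            length(col) == 1, col %in% names(df))
  out <- data.frame(table(df[[col]]))
  names(out)[1] <- col   # label the first column with the requested name
  out
}

head(foo.function2(mtcars, "mpg"), 4)
```

This version does not depend on how the caller wrote the argument, so it keeps working when the column name is stored in a variable or the function is called from inside another function.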