r, expression, non-standard-evaluation

canonical NSE differentiation between names and expressions


Is there a canonical base-R method to determine whether a function argument is an object name rather than a literal or expression?

While NSE is typically discouraged, occasionally somebody has a good idea and wants to use it. The simplest use-case that I might justify as "convenient" is data.frame: if you include saved vectors, it will use the object name as the column name. (In fact, many classes seem to teach this as the best/only way to make frames.)

vec <- 1234:1235
data.frame(vec)
#    vec
# 1 1234
# 2 1235

But feed the raw vector, and it does not go as well:

data.frame(1234:1235)
#   X1234.1235
# 1       1234
# 2       1235

A common way in base R to do this form of NSE is to use deparse(substitute(x)), which will return a string of either the object name ("vec") or the expression passed as a literal ("1234:1235").
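As a minimal sketch of that idiom (the helper name show_arg is hypothetical and used only for illustration), the same function yields the object name in one case and the deparsed expression in the other:

```r
# Hypothetical helper: return the deparsed text of whatever was passed as x.
# substitute(x) captures the unevaluated argument; deparse() turns it into a string.
show_arg <- function(x) deparse(substitute(x))

vec <- 1234:1235
show_arg(vec)
# [1] "vec"
show_arg(1234:1235)
# [1] "1234:1235"
```

Note that deparse() can return a character vector with more than one element when the expression is long, so code that relies on a single string may need to collapse the result.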

Some is.* functions exist (e.g., is.object, is.expression), though they are for different purposes. I wouldn't want to rely on is.vector or similar structure-specific functions, as that doesn't generalize into more-complicated structured parameters (without significant customization).

My thought was to try(get(.)) the object, and if that fails for whatever reason (typically "not found"), then the argument is most likely an expression or literal. (This does not work when the NSE name does not refer to an existing object: with library(ggplot2), for example, there is unlikely to be an object named "ggplot2" in the calling environment.)

For example,

func <- function(x, ...) {
  xname <- deparse(substitute(x))
  isobj <- !inherits(try(get(xname), silent = TRUE), "try-error")
  if (isobj) "yes" else "no"
}

zz <- 1  # zz must exist for get() to find it
func(zz)
# [1] "yes"
func(c(zz))
# [1] "no"

(Assume that func is meant to do more with isobj than just return a string.)

Extending that method with a technique for dealing with the ellipsis, I can do:

my_names <- function(...) {
  dot_obj_expr <- sapply(eval(substitute(alist(...))), deparse)
  nms <- names(dot_obj_expr)
  if (is.null(nms)) nms <- rep("", length(dot_obj_expr))
  hasgoodnames <- nzchar(nms) |
    !sapply(dot_obj_expr, function(obj) inherits(try(get(obj), silent = TRUE), "try-error"))
  # rational starting point
  goodnames <- sprintf("V%i", seq_along(dot_obj_expr))
  # replace those that were already named
  goodnames[nzchar(nms)] <- nms[nzchar(nms)]
  # replace those that were not named but appear to be an object-name
  goodnames[!nzchar(nms) & hasgoodnames] <- dot_obj_expr[!nzchar(nms) & hasgoodnames]
  goodnames
}

zz <- 1
my_names(zz, abc = 1:3, 1:3)
# [1] "zz"  "abc" "V3" 

### less clear when an arg is not found
rm(zz)
my_names(zz, abc = 1:3, 1:3)
# [1] "V1"  "abc" "V3" 

which provides a (perhaps) more-reasonable convention for using the names.

I haven't done benchmarking or profiling to know if this would have any significant "penalty". I believe that try(get(.)) is fairly efficient, and since the code so far does not alter any of the objects, I see no memory-copy being needed.

(This question is informed by Advanced R, http://adv-r.had.co.nz/Computing-on-the-language.html.)


Solution

  • I'm not sure if it's helpful, but it seems that you just want to check whether the parameter is a symbol or something else. You can do that with

    func <- function(x, ...) {
      xobj <- substitute(x)
      if (is.symbol(xobj)) "yes" else "no"
    }
    func(zz)
    # [1] "yes"
    func(c(zz))
    # [1] "no"
    

    This will return "yes" for zz whether or not the variable zz actually exists, which may or may not be what you want.
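
    If existence does matter, or if the same check is needed for the ellipsis, the idea might be extended roughly as follows. This is only a sketch under assumptions: the names func_exists and my_names_sym are hypothetical, and looking the symbol up from parent.frame() is one possible choice of environment, not something the question prescribes.

```r
# Hypothetical variant: TRUE only for a bare symbol that resolves to an
# object visible from the caller's environment (exists() also searches
# enclosing environments, so base functions such as c count as existing).
func_exists <- function(x, ...) {
  xobj <- substitute(x)
  is.symbol(xobj) && exists(as.character(xobj), envir = parent.frame())
}

# Hypothetical ellipsis version: explicit names win, bare symbols supply
# their own name, and everything else falls back to V1, V2, ...
my_names_sym <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1L]  # unevaluated arguments
  nms <- names(exprs)
  if (is.null(nms)) nms <- rep("", length(exprs))
  out <- sprintf("V%i", seq_along(exprs))
  named <- nzchar(nms)
  out[named] <- nms[named]
  is_sym <- vapply(exprs, is.symbol, logical(1), USE.NAMES = FALSE)
  use_sym <- !named & is_sym
  out[use_sym] <- vapply(exprs[use_sym], as.character, character(1),
                         USE.NAMES = FALSE)
  out
}

zz <- 1
func_exists(zz)
# [1] TRUE
func_exists(c(zz))
# [1] FALSE
my_names_sym(zz, abc = 1:3, 1:3)
# [1] "zz"  "abc" "V3"
```

    Because my_names_sym never evaluates or looks up its arguments, it returns "zz" for a bare zz even after rm(zz), which differs from the try(get(.)) version in the question.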