
Why must local({...}) be defined using two rounds of expression quoting?


I'm trying to understand how R's local function works. With it, you can open a temporary local scope, which means that what happens in local (most notably, variable definitions) stays in local. Only the last value of the block is returned to the outside world. So:

x <- local({
    a <- 2
    a * 2
}) 

x
## [1] 4

a
## Error: object 'a' not found

local is defined like this:

local <- function(expr, envir = new.env()){
    eval.parent(substitute(eval(quote(expr), envir)))
}

As I understand it, two rounds of expression quoting and subsequent evaluation happen:

  1. eval(quote([whatever expr input]), [whatever envir input]) is generated as an unevaluated call by substitute.
  2. The call is evaluated in local's caller frame (in our case, the Global Environment), so [whatever expr input] is evaluated in [whatever envir input]

However, I do not understand why step 2 is necessary. Why can't I simply define local like this:

local2 <- function(expr, envir = new.env()){
    eval(quote(expr), envir)
}

I would think it evaluates the expression expr in an empty environment? So any variable defined in expr should exist in envir and therefore vanish after the end of local2?

However, if I try this, I get:

x <- local2({
    a <- 2
    a * 2
}) 
x
## [1] 4
a
## [1] 2

So a leaks to the Global Environment. Why is this?

EDIT: Even more mysterious: Why does it not happen for:

eval(quote({a <- 2; a*2}), new.env())
## [1]  4
a
## Error: object 'a' not found

Solution

  • Parameters to R functions are passed as promises. They are not evaluated unless the value is specifically requested. So look at

    # clean up first
    if (exists("a")) rm(a)
    
    f <- function(x) print(1)
    f(a<-1)
    # [1] 1
    a
    # Error: object 'a' not found
    g <- function(x) print(x)
    g(a<-1)
    # [1] 1
    a
    # [1] 1
    

    Note that g() uses the value of its parameter, which forces the promise: the assignment a <- 1 is evaluated, and that creates a in the global environment. With f(), a is never created because the parameter remained a promise and was never evaluated.

    If you want to access a parameter without evaluating it, you need to use something like match.call() or substitute(). The local() function does the latter.
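
    As a small illustration of that point, the sketch below captures an argument's code with both approaches without ever running it. The helper names s and m and the variable probe_value are made up for this example and do not come from base R.

    ```r
    # s() and m() are hypothetical helpers used only for this illustration.
    s <- function(x) substitute(x)     # the argument's code, unevaluated
    m <- function(x) match.call()$x    # the same code, taken from the matched call

    captured_s <- s(probe_value <- 1)
    captured_m <- m(probe_value <- 1)

    captured_s
    # probe_value <- 1

    # Both helpers hand back the same unevaluated call ...
    same_code <- is.call(captured_s) && identical(captured_s, captured_m)
    same_code

    # ... and because the promise was never forced, nothing was assigned.
    was_assigned <- exists("probe_value", inherits = FALSE)
    was_assigned
    ```

    Neither helper touches the value of x, so the assignment inside the argument never runs.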

    If you remove the eval.parent(), you'll see that the substitute puts the un-evaluated expression from the parameter into a new call to eval().

    h <- function(expr, envir = new.env()){
        substitute(eval(quote(expr), envir))
    }
    h(a<-1)
    # eval(quote(a <- 1), new.env())
    

    Whereas if you do

    j<- function(x) {
      quote(x)
    }
    j(a<-1)
    # x
    

    you are not really creating a new function call; you just get the symbol x back. Furthermore, when you eval() that expression, you trigger the evaluation of x in its original calling environment (forcing the promise) rather than evaluating the expression in a new environment.
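
    One way to check this is to let the block report where it is running. The sketch below is only an illustration: local2 is the definition from the question, while where_local, where_local2, ran_in_caller and ran_in_new are names invented here. It also bears on the EDIT in the question: in eval(quote({a <- 2; a*2}), new.env()) the quoted object is the block itself, not a symbol that stands for a promise, so the block really does run in the new environment.

    ```r
    # local2 is the definition from the question; the other names are hypothetical.
    local2 <- function(expr, envir = new.env()) {
        eval(quote(expr), envir)
    }

    # environment() with no arguments returns the environment it is called from.
    where_local  <- local({ environment() })
    where_local2 <- local2({ environment() })

    # With local2() the block ran in the environment we called it from ...
    ran_in_caller <- identical(where_local2, environment())
    ran_in_caller

    # ... while with local() it ran in a fresh environment.
    ran_in_new <- !identical(where_local, environment())
    ran_in_new
    ```

    If both checks come out TRUE, the block passed to local2() was evaluated as a promise in the caller's environment, which is why a leaks.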

    local() then uses eval.parent() so that the block can see variables that already exist in the calling environment. For example

    b<-5
    local({
      a <- b
      a * 2
    })
    # [1] 10
    

    Look at the behaviors here

    local2 <- function(expr, envir = new.env()){
        eval(quote(expr), envir)
    }
    local2({a<-5; a})
    # [1] 5
    local2({a<-5; a}, list(a=100, expr="hello"))
    # [1] "hello"
    

    Notice that when we pass a non-empty environment, eval() looks up the symbol expr in that environment; it is not evaluating your code block there.
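
    For comparison, here is a minimal sketch of a one-step variant that does keep its variables local, assuming you swap quote(expr) for substitute(expr). The name local3 and the variables in the usage are hypothetical, and this is not how base R defines local().

    ```r
    # local3 is a hypothetical name; a sketch, not base R's definition of local().
    local3 <- function(expr, envir = new.env()) {
        # substitute(expr) returns the caller's code as an unevaluated call,
        # so eval() runs that code in envir instead of forcing the promise.
        eval(substitute(expr), envir)
    }

    outer_value <- 5
    result3 <- local3({
        inner_value <- outer_value
        inner_value * 2
    })
    result3
    # [1] 10

    # The assignment stayed inside envir, so nothing leaked to the caller.
    leaked3 <- exists("inner_value", inherits = FALSE)
    leaked3
    # [1] FALSE
    ```

    One difference from local() is worth keeping in mind: here the default new.env() is created from inside local3, so variable lookup from the block passes through local3's own frame (where expr and envir live) and then through the environment where local3 was defined, rather than starting from the caller. Building the call with substitute() and running it with eval.parent(), as local() does, avoids that.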