R: environment diagram for decorator function


I want to draw an environment diagram for the following code, which contains an error, to understand exactly how R evaluates a function.

# emphasize text
emph <- function(f, style = '**') {
  function(...) {
    if (length(style) == 1) {
      paste(style, f(...), style)
    } else {
      paste(style[1], f(...), style[2])
    }
  }
}

# function to be decorated
tmbg <- function() {
  'tmbg are okay'
}

# a decorator function with self-referencing name
tmbg <- emph(tmbg)

I get an error when evaluating the call to the decorated function:

tmbg()
> Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

I understand this is related to the lazy evaluation of function arguments in R. It seems that when tmbg() is evaluated in the global frame, the name f used in the returned anonymous function is looked up again as tmbg in the global frame, which by then is the anonymous function itself; that function calls f again, which leads to infinite recursion. But this picture is not clear to me, because I don't know exactly what evaluation model R uses, especially with this "lazy evaluation".

Below I draw the essential parts of the environment diagrams and explain the evaluation rules Python uses for the equivalent code. I hope to get similar environment diagrams for R, or at least the same level of clarity about R's environment model.

# This is the equivalent python code 
def emph(f, style = ['**']):
    def wrapper(*args):
        if len(style) == 1:
            return style[0] + f(*args) + style[0]
        else:
            return style[0] + f(*args) + style[1]
    return wrapper
    
def tmbg():
    return 'tmbg are okay'

tmbg = emph(tmbg)

tmbg()

When evaluating the assignment statement at line 12, tmbg = emph(tmbg), the call expression emph(tmbg) has to be evaluated first. When the call is applied, its formal parameter f is bound to the function that the name tmbg refers to in the global frame at that moment, as shown in the picture below.

[Image: Python environment diagram during the call emph(tmbg)]

Next, once the call expression emph(tmbg) has been evaluated, the function wrapper it returns is bound to the name tmbg in the global frame. However, the binding between f and the original function tmbg is still held in the local frame created by emph (f1 in the diagram below).

Therefore, when tmbg() is evaluated in the global frame, there is no confusion about which is the wrapper (tmbg in the global frame) and which is the function being decorated (f in the local frame). This is where R differs.

It looks as if R changes the binding from f -> function tmbg() to f -> the name tmbg in the global frame, which in turn is bound to the wrapper function that calls f, and this leads to the infinite recursion. But R might also use a completely different model in which f is never bound to an object at all, only to the name tmbg, regardless of what that name refers to. When evaluation starts, R looks up the name tmbg, finds the global one created by tmbg <- emph(tmbg), and recurses forever. This seems strange, though: as soon as an expression is passed as an argument, the local scope created by the function call would no longer count (or would count only partially) for the purposes of "lazy evaluation". There would then have to be a separate system, running in parallel with the environments created by function calls, that manages namespaces and scopes.

In either case, the environment model and evaluation rules of R are not clear to me. I want to understand them and, if possible, draw an environment diagram for the R code that is as clear as the one below.

[Image: Python environment diagram after tmbg = emph(tmbg), with local frame f1]


Solution

  • The problem is not understanding environments. The problem is understanding lazy evaluation.

    Because of lazy evaluation, f is just a promise: it is not evaluated until the anonymous function is run, and by that time tmbg has been redefined. To make f evaluate while emph is running, add the force statement marked ### below. No other lines are changed.

    In terms of environments, the anonymous function gets f from the frame of emph, and in that frame f is a promise. Unless we add the force statement, the promise's expression (tmbg) is not looked up in the caller's environment until the anonymous function is run.

    emph <- function(f, style = '**') {
      force(f)  ###
      function(...) {
        if (length(style) == 1) {
          paste(style, f(...), style)
        } else {
          paste(style[1], f(...), style[2])
        }
      }
    }
    
    # function to be decorated
    tmbg <- function() {
      'tmbg are okay'
    }
    
    # a decorator function with self-referencing name
    tmbg <- emph(tmbg)
    
    tmbg()
    ## [1] "** tmbg are okay **"
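
    The same effect can be reproduced without the decorator. The following is only a minimal sketch: make_getter, make_getter_forced and val are hypothetical names introduced for illustration, not part of the question's code. It compares a closure whose argument is left as a promise with one whose argument is forced before the closure is returned.

    ```r
    # Hypothetical helper: returns a closure that reads its argument x.
    make_getter <- function(x) {
      function() x              # x is still an unevaluated promise here
    }

    # Same helper, but the promise is forced before the closure is returned.
    make_getter_forced <- function(x) {
      force(x)
      function() x
    }

    val <- "old"
    lazy   <- make_getter(val)
    forced <- make_getter_forced(val)
    val <- "new"                # rebind the global name, as tmbg <- emph(tmbg) does

    lazy_result   <- lazy()     # promise is evaluated now and sees the new binding
    forced_result <- forced()   # promise was evaluated before the rebinding
    print(c(lazy = lazy_result, forced = forced_result))
    ```

    Here lazy_result is "new" and forced_result is "old". In the decorator, the rebinding happens to point the name at the wrapper itself, which is why the lazy version recurses instead of just returning a different value.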
    

    We can look at the promise using the pryr package.

    library(pryr)
    
    emph <- function(f, style = '**') {
      str(promise_info(f))
      force(f)
      cat("--\n")
      str(promise_info(f))
      function(...) {
        if (length(style) == 1) {
          paste(style, f(...), style)
        } else {
          paste(style[1], f(...), style[2])
        }
      }
    }
    
    # function to be decorated
    tmbg <- function() {
      'tmbg are okay'
    }
    
    tmbg <- emph(tmbg)
    

    which results in the output below. It shows that f is at first unevaluated, but after force is invoked the promise holds the value of f. Had we not used force, the anonymous function would have accessed f in the state shown in the first promise_info() output, so all it would know is the symbol tmbg and where to look it up (the global environment).

    List of 4
     $ code  : symbol tmbg
     $ env   :<environment: R_GlobalEnv> 
     $ evaled: logi FALSE
     $ value : NULL
    --
    List of 4
     $ code  : symbol tmbg
     $ env   : NULL
     $ evaled: logi TRUE
     $ value :function ()  
      ..- attr(*, "srcref")= 'srcref' int [1:8] 1 13 3 5 13 5 1 3
      .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x00000000102c3730>
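
    For the environment diagram itself, the frames can also be inspected with base R alone. This is a sketch that assumes the force(f) version of emph shown above; original and f1 are hypothetical names added here only so that the bindings can be compared (f1 mirrors the frame label used in the question's Python diagram).

    ```r
    emph <- function(f, style = '**') {
      force(f)
      function(...) {
        if (length(style) == 1) {
          paste(style, f(...), style)
        } else {
          paste(style[1], f(...), style[2])
        }
      }
    }

    tmbg <- function() {
      'tmbg are okay'
    }
    original <- tmbg    # hypothetical extra name, kept only for comparison
    tmbg <- emph(tmbg)

    # Frame created by the call emph(tmbg); it encloses the returned wrapper.
    f1 <- environment(tmbg)

    ls(f1)                                        # "f" "style"
    identical(f1$f, original)                     # TRUE: f is the undecorated function
    identical(f1$f, tmbg)                         # FALSE: the global tmbg is now the wrapper
    identical(parent.env(f1), environment(emph))  # TRUE: f1's parent is where emph was defined
    ```

    Read as a diagram: the global frame binds tmbg to the wrapper, the wrapper's enclosing environment is the frame f1, and f1 binds f to the original function. That matches the Python picture, provided the promise was forced before the global name was rebound.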