Environments in RStudio when sourcing


I have found that the usual way of finding the directory of a script sourced in RStudio is "dirname(parent.frame(2)$ofile)". I tried to research what this means and read many explanations of environments, but I am still no closer to understanding this command. I ran this script:

print(environment())
print(parent.frame(1))
print(parent.frame(2))
print(parent.frame(3))
print(parent.frame(4))

f <- function() {
  print('Do:')
  print(environment())
  print(parent.frame(1))
  print(parent.frame(2))
  print(parent.frame(3))
  print(parent.frame(4))
}

f()

print("Parent 1:")
print(ls(parent.frame(1)))
print("Parent 2:")
print(ls(parent.frame(2)))

print(identical(environment(),parent.frame(1)))
print(identical(environment(),parent.frame(2)))
print(identical(environment(),parent.frame(3)))
print(identical(environment(),parent.frame(4)))

I got the output:

<environment: R_GlobalEnv>
<environment: 0x111987ab0>
<environment: 0x107fef700>
<environment: R_GlobalEnv>
<environment: R_GlobalEnv>
[1] "Do:"
<environment: 0x1119957d8>
<environment: R_GlobalEnv>
<environment: 0x111995998>
<environment: 0x107fef700>
<environment: R_GlobalEnv>
[1] "Parent 1:"
[1] "enclos" "envir"  "expr"  
[1] "Parent 2:"
 [1] "chdir"              "continue.echo"      "curr.fun"           "deparseCtrl"       
 [5] "echo"               "ei"                 "enc"                "encoding"          
 [9] "envir"              "exprs"              "file"               "filename"          
[13] "from_file"          "have_encoding"      "i"                  "i.symbol"          
[17] "keep.source"        "lastshown"          "lines"              "loc"               
[21] "local"              "max.deparse.length" "Ne"                 "ofile"             
[25] "print.eval"         "prompt.echo"        "skip.echo"          "spaced"            
[29] "srcfile"            "srcrefs"            "tail"               "use_file"          
[33] "verbose"            "width.cutoff"       "yy"                
[1] FALSE
[1] FALSE
[1] TRUE
[1] TRUE

I am not sure I understand the output.

1) What exactly are parents 1 and 2 of the Global Environment? Where can I read more on their attributes, including the used $ofile?

2) Why is parent.frame(1) not equivalent to parent.frame(2) from within the function? Aren't they identical - the parent of Global Env?

3) Why does parent.frame start returning global environment when numbers get sufficiently big? Is this just how the function is written or is there some logic to this hierarchy?


Solution

  • When you walk up with parent.frame(), you see the environments of the functions that called your code. The source() function does not just dump the commands into your console; it contains a fair amount of code of its own to make that work. In essence it runs something like:

    source <- function(...) {
       ...
       eval(ei, envir)
       ...
    }
    

    where ei is one of the expressions in your file. Then eval looks like this:

    eval <- function(expr, envir, enclos) {  # default arguments omitted
        .Internal(eval(expr, envir, enclos))
    }
    

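    So when you call parent.frame(1) from the top level of a sourced file, the first thing it sees is the eval() call. If you look at formals(eval), you can see the same three variables that are listed in your first parent. The second parent lists all the variables created inside source() itself, including the ei variable we just saw and the ofile variable you asked about. Inside f() everything shifts by one level: parent.frame(1) is the global environment that f() was called from, parent.frame(2) is the eval() frame and parent.frame(3) is the source() frame. So here is where those values are for top-level code: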

    # parent.frame(4)
    # parent.frame(3)
    source <- function(...) {
       # parent.frame(2)
       eval(ei, envir)
    }
    eval <- function(expr, envir, enclos) {  # default arguments omitted
         # parent.frame(1)
        .Internal(eval(expr, envir, enclos))
                      # ^^ your code
    }
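
    The same frame layout can be reproduced without RStudio or a real file. This is only a rough sketch: mini_source() and the "script.R" value are made up for illustration and are far simpler than the real source(). The quoted expression plays the role of top-level code in a sourced file.

    ```r
    # mini_source() is a hypothetical stand-in for source(); it is not part of R.
    mini_source <- function(expr, envir = globalenv()) {
      ofile <- "script.R"  # mimics the local variable that source() keeps
      eval(expr, envir)    # same pattern as eval(ei, envir) in source()
    }

    res <- mini_source(quote(list(
      p1      = ls(parent.frame(1)),       # variables in the frame of eval()
      p2      = ls(parent.frame(2)),       # variables in the frame of mini_source()
      ofile   = parent.frame(2)$ofile,     # a variable local to mini_source()
      # asking for more frames than exist falls back to the global environment
      too_far = identical(parent.frame(50), globalenv())
    )))
    print(res)
    ```

    Here p1 should show the arguments of eval(), p2 the local variables of mini_source(), and too_far illustrates why large values of n keep returning the global environment.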
    

    But variable resolution in R doesn't look in the environments that functions are called from. Instead it uses lexical scoping, so it looks where a function is defined (not called). If you want that environment, you can call parent.env(environment()) from inside the function. For a simple function defined at the top level, that is the global environment. So parent.frame is really just an unfortunate name: it returns the frame of the caller, not the "parent" that R uses for variable lookup.
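
To make the difference between the calling frame and the defining environment concrete, here is a small sketch. The names make_f, f and g are invented for this example; the only assumption is that f is defined in one place and called from another.

```r
make_f <- function() {
  # f is created here, so make_f()'s frame becomes f's enclosing environment
  function() {
    caller    <- parent.frame()             # frame that f was called from
    enclosure <- parent.env(environment())  # environment that f was defined in
    list(caller = caller, enclosure = enclosure)
  }
}
f <- make_f()

g <- function() {
  g_env <- environment()  # remember g's own frame so it can be compared later
  list(g_env = g_env, from_f = f())
}
res <- g()

called_from_g   <- identical(res$from_f$caller, res$g_env)
defined_in_make <- identical(res$from_f$enclosure, environment(f))
same_env        <- identical(res$from_f$caller, res$from_f$enclosure)
print(c(called_from_g, defined_in_make, same_env))
```

The caller reported by parent.frame() is g's frame, while parent.env(environment()) points at the frame of make_f() where f was created, so the two are different environments.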