Difference between applying a function() to list() and to new.env()?


Why do I get two different results when I print x$val? I understand that the first x is a list and the second is an environment, but I do not understand why x$val in the second chunk ends up as NA.

x <- list()
x$val <- 1
myfun <- function(x) {x$val <- x$val + NA} 
myfun(x)
x$val
##[1] 1
x <- new.env()
x$val <- 18
myfun <- function(x) {x$val <- x$val + NA} 
myfun(x)
x$val
##[1] NA

Solution

  • There are several issues here:

    1. Return value. A function returns the value of the last expression it evaluates. In both versions of myfun that expression is the assignment x$val <- x$val + NA, whose value is NA (adding NA to any number gives NA), so both calls return the same value.

    2. Copy on modify. If an ordinary object such as a list x is modified inside a function, R creates a copy of the object and modifies the copy. The original object outside the function is not changed.

    3. Object identity. Environments have an identity independent of their contents, so changing the contents of an environment does not change the identity of the environment itself; it only changes the contents. Thus changing the contents of an environment inside a function does not cause the environment to be copied. (This is similar to pointers in C, where a program can modify the pointed-to data without modifying the pointer itself.) Lists, on the other hand, do not have an identity distinct from their contents. Modifying the contents of a list inside a function causes the list to be copied to a new list, and it is the new list that is modified.
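
    The three points can be checked together in base R. The following is a minimal sketch, not part of the original answer: add_na is a hypothetical name for a function with the same body as myfun, and lst and env are illustrative variable names.

```r
# add_na is a hypothetical name; its body is the same as myfun in the question
add_na <- function(x) {
  x$val <- x$val + NA
}

# List: the function works on a local copy
lst <- list(val = 1)
res_list <- add_na(lst)   # capture the value of the final assignment
is.na(res_list)           # TRUE: the function did compute NA
lst$val                   # still 1: the caller's list is untouched

# Environment: the function works on the caller's environment itself
env <- new.env()
env$val <- 18
res_env <- add_na(env)
is.na(res_env)            # TRUE: same return value as for the list
is.na(env$val)            # TRUE: the caller's environment was changed in place
```

    Both calls return NA; the only difference is whether the caller's object was changed.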

    Example

    Below, we use address() from the pryr package to track the address of the list. Printing an environment already shows its address, so address() is not needed for that case. The trace() calls below make R show the address on entry to and on exit from the function.

    The address of the list is ...968 before entering the function and upon entry. After the list is modified within the function, it has become a new list at a new address, ...200, which is local to the function and distinct from the list outside the function, which is still at address ...968.

    library(pryr)
    
    x <- list()
    x$val <- 1
    myfun_list <- function(x) {x$val <- x$val + NA}
    trace(myfun_list, tracer = quote(print(address(x))), exit = quote(print(address(x))))
    ## [1] "myfun_list"
    address(x)
    ## [1] "000000000bbbb968"
    myfun_list(x)
    ## Tracing myfun_list(x) on entry 
    ## [1] "000000000bbbb968"
    ## Tracing myfun_list(x) on exit 
    ## [1] "000000000b368200"
    ## [1] NA
    address(x)
    ## [1] "000000000bbbb968"
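
    Because only the local copy at the new address is modified, a list has to be returned and reassigned for the change to be visible to the caller. A minimal sketch, assuming a hypothetical function name update_val and variable name lst that are not part of the original answer:

```r
# update_val is a hypothetical name used only for illustration
update_val <- function(x) {
  x$val <- x$val + NA  # modifies the function's local copy
  x                    # return the modified copy explicitly
}

lst <- list(val = 1)
lst <- update_val(lst)  # rebind lst to the returned copy
is.na(lst$val)          # TRUE: the change is now visible to the caller
```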
    

    An environment, on the other hand, has an identity distinct from its contents, so changing the contents does not cause it to be copied to a new environment. The environment starts out at ...238 and stays at that address throughout the code.

    x <- new.env()
    x$val <- 18
    myfun_env <- function(x) {x$val <- x$val + NA} 
    trace(myfun_env, tracer = quote(print(x)), exit = quote(print(x)))
    ## [1] "myfun_env"
    x
    ## <environment: 0x000000000cac4238>
    myfun_env(x)
    ## Tracing myfun_env(x) on entry 
    ## <environment: 0x000000000cac4238>
    ## Tracing myfun_env(x) on exit 
    ## <environment: 0x000000000cac4238>
    x
    ## <environment: 0x000000000cac4238>
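
    The same identity can also be checked without reading addresses, because identical() applied to two environments is TRUE only when both refer to the same environment. A minimal sketch in base R; touch_env and the variable names are hypothetical and not part of the original answer:

```r
# touch_env is a hypothetical name used only for illustration
touch_env <- function(e) {
  e$val <- e$val + NA  # modify the contents
  e                    # return the environment the function ended up with
}

env <- new.env()
env$val <- 18
after <- touch_env(env)

identical(after, env)      # TRUE: same environment, no copy was made
is.na(env$val)             # TRUE: the caller sees the change

# A freshly created environment is a different object
identical(new.env(), env)  # FALSE
```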