r function global-variables return-value

Alternative to writing to global variable from within function

I have some code that works, but I understand it relies on bad practice to do so. As a simplified representation of the problem, take this code:

operation <- function(index){
  a <- 0
  if(data[index] == FALSE){
    data[index] <<- TRUE
    a <- a + 1}
  
  a <- a + 1
  return(a)
}

data <- c(FALSE, FALSE, FALSE)

x <- 0
x <- x + operation(sample(c(1,2,3),1))
x <- x + operation(sample(c(1,2,3),1))
x <- x + operation(sample(c(1,2,3),1))
x

The "operation" function has two purposes: first, to return 2 if the element of data at the given index is FALSE, or 1 if it is TRUE; and second, importantly, to set that element to TRUE so that future calls with the same index return 1.

There are two problems with this. First, the operation function references a global variable which I know will always exist in my use case, but which hypothetically may not. Second, the function writes to the global variable with the <<- operator, which I understand is very bad practice.

Is there a better-practice way to achieve the same functionality without the function writing to the global variable?

Solution

We can use object-oriented programming (OOP). Compare this with the list-based approach in another answer to see the added clarity of OOP once the object has been defined: the code that calls the op method hardly changes from the question. Approaches 1a, 2 and 3 do not require any add-on packages.
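
For comparison, a list-based approach might look roughly like the sketch below. This is an assumed reconstruction for illustration, not the code from the other answer: the function takes data as an argument and returns both the value and the updated vector, and the caller carries the state forward. The indices are fixed rather than sampled so that the result is reproducible.

```r
# Hypothetical pure-function variant: nothing outside the function is modified.
operation <- function(data, index) {
  a <- 1
  if (!data[index]) {
    data[index] <- TRUE          # changes the function's local copy only
    a <- a + 1
  }
  list(value = a, data = data)   # return the result and the updated state
}

data <- c(FALSE, FALSE, FALSE)
x <- 0
for (i in c(2, 2, 3)) {          # fixed indices instead of sample()
  res  <- operation(data, i)
  x    <- x + res$value
  data <- res$data               # the caller must store the new state
}
x      # 5
data   # FALSE  TRUE  TRUE
```

The extra bookkeeping at each call site is what the OOP versions below avoid.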

1) proto First we use the proto package for OOP. proto objects are environments with certain added methods. Here p is a proto object that contains data and also a method op. Note that with proto we can avoid the use of <<-. Also, unlike class-based object-oriented systems, proto lets us define an object (here p) directly, without first defining a class.

library(proto)

p <- proto(op = function(., index) {
  a <- 0
  if( ! .$data[index] ) {
    .$data[index] <- TRUE
    a <- a + 1
  }
  a <- a + 1
  return(a)
})

p$data <- c(FALSE, FALSE, FALSE)

x <- 0
x <- x + p$op(sample(c(1,2,3),1))
x <- x + p$op(sample(c(1,2,3),1))
x

p$data

1a) A variation of this is to use just plain environments.

e <- local({
  op <- function(index) {
    a <- 0
    if( ! data[index] ) {
      data[index] <<- TRUE
      a <- a + 1
    }
    a <- a + 1
    return(a)
  }
  environment()
})

e$data <- c(FALSE, FALSE, FALSE)

x <- 0
x <- x + e$op(sample(c(1,2,3),1))
x <- x + e$op(sample(c(1,2,3),1))
x

e$data
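
The question also worries that data might not exist when the function runs. One possible variation on 1a (an assumption here, not part of the code above) is to define data inside local() itself, so the environment always carries its own state. The sketch below uses fixed indices to make the behaviour easy to check.

```r
# Variation of 1a: data lives inside the environment from the start.
e <- local({
  data <- c(FALSE, FALSE, FALSE)
  op <- function(index) {
    a <- 1
    if (!data[index]) {
      data[index] <<- TRUE   # <<- finds data in the enclosing environment e
      a <- a + 1
    }
    a
  }
  environment()
})

first  <- e$op(2)   # element 2 is FALSE: it is set to TRUE and 2 is returned
second <- e$op(2)   # element 2 is now TRUE: 1 is returned
first    # 2
second   # 1
e$data   # FALSE  TRUE FALSE
```

Here <<- is still used, but it only ever modifies the copy of data held in e, not a variable in the global environment.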

2) Reference Classes Reference classes for OOP come with R and do not require any add-on packages. This may be overkill, since it requires creating a class that is only ever instantiated once, whereas with proto we can create an object directly without this extra step.

MyClass <- setRefClass("MyClass", fields = "data",
  methods = list(
    op = function(index) {
       a <- 0
       if( ! data[index] ) {
         data[index] <<- TRUE
         a <- a + 1
       }
       a <- a + 1
       return(a)
    }
  )
)

obj <- MyClass$new(data = c(FALSE, FALSE, FALSE))
x <- 0
x <- x + obj$op(sample(c(1,2,3),1))
x <- x + obj$op(sample(c(1,2,3),1))
x

obj$data

3) scoping It is possible to devise a poor man's OOP system in R by making use of function scoping. Try demo(scoping) for another example. This also does not require any packages. It shares the disadvantage of (2): it requires defining a class that is only used once.

cls <- function(data = NULL) {
  list(
    put_data = function(x) data <<- x,
    get_data = function() data,
    op = function(index) {
      a <- 0
      if( ! data[index] ) {
        data[index] <<- TRUE
        a <- a + 1
      }
      a <- a + 1
      return(a)
    }
  )
}

obj <- cls(data = c(FALSE, FALSE, FALSE)) 
x <- 0
x <- x + obj$op(sample(c(1,2,3),1))
x <- x + obj$op(sample(c(1,2,3),1))
x

obj$get_data()
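
If more than one object is ever needed, each call to the constructor gets its own copy of data. The following sketch repeats a trimmed version of cls (put_data is omitted) with fixed indices to illustrate this; the names obj1 and obj2 are only for the example.

```r
# Each call to cls() creates a fresh enclosing environment holding its own data.
cls <- function(data = NULL) {
  list(
    get_data = function() data,
    op = function(index) {
      a <- 1
      if (!data[index]) {
        data[index] <<- TRUE   # modifies data in this object's environment
        a <- a + 1
      }
      a
    }
  )
}

obj1 <- cls(c(FALSE, FALSE, FALSE))
obj2 <- cls(c(FALSE, FALSE, FALSE))

r1 <- obj1$op(1)   # 2: first use of index 1 in obj1
r2 <- obj1$op(1)   # 1: index 1 already TRUE in obj1
r3 <- obj2$op(3)   # 2: obj2 has its own, untouched data

obj1$get_data()    # TRUE FALSE FALSE
obj2$get_data()    # FALSE FALSE  TRUE
```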

4) You can also explore R6, R.oo and oops, which are other CRAN packages that implement OOP in R.
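
As a rough illustration of the R6 option, the same idea could be written as below. This is a minimal sketch that assumes the R6 package is installed; the class name Tracker is made up for the example, and the indices are fixed so the result is reproducible.

```r
library(R6)

# Hypothetical class name; data is a public field, op a public method.
Tracker <- R6Class("Tracker",
  public = list(
    data = NULL,
    initialize = function(data) {
      self$data <- data
    },
    op = function(index) {
      a <- 1
      if (!self$data[index]) {
        self$data[index] <- TRUE   # updates the object's field in place, no <<-
        a <- a + 1
      }
      a
    }
  )
)

obj <- Tracker$new(c(FALSE, FALSE, FALSE))
x <- 0
x <- x + obj$op(1)
x <- x + obj$op(1)
x          # 3
obj$data   # TRUE FALSE FALSE
```

Like reference classes, this requires defining a class first, but fields are accessed through self rather than modified with <<-.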