
How do I correctly use the env argument of data.table within a function?


Let us take a simple example:

data <- data.table::data.table(a = 1:5, b = 2:6)
j <- quote(c("c") := list(a + 1))
data[, j, env = list(j = j)][]
#        a     b     c
#    <int> <int> <num>
# 1:     1     2     2
# 2:     2     3     3
# 3:     3     4     4
# 4:     4     5     5
# 5:     5     6     6

The above works and produces the correct output. However, if I place this inside a function, I get a very different output:

data <- data.table::data.table(a = 1:5, b = 2:6)
test <- function(data, ...) {
  dots <- eval(substitute(alist(...)))
  j <- call(":=", call("c", names(dots)), unname(dots))
  print(j)
  data[, j, env = list(j = j)][]
}
test(data = data, c = a + 1)
# `:=`(c("c"), list(a + 1))
#        a     b         c
#    <int> <int>    <list>
# 1:     1     2 <call[3]>
# 2:     2     3 <call[3]>
# 3:     3     4 <call[3]>
# 4:     4     5 <call[3]>
# 5:     5     6 <call[3]>

I assume that the c = a + 1 is just not being evaluated in the correct environment (i.e. the data.table itself).

EDIT: I am using data.table 1.14.3


Solution

  • This happens because dots isn't a call; it's a list of calls. When data.table evaluates j, the right-hand side of := is that list itself rather than a call to list(), so data.table tries to insert the list into the new column as is.
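
    To see the difference without data.table, here is a small base R sketch that rebuilds both versions of j from the question. The names j_fn and j_ok are made up for illustration. The two objects look alike when printed, but their right-hand sides are different kinds of object:

    ```r
    # Rebuild both versions of j from the question (base R only).
    dots <- alist(c = a + 1)
    j_fn <- call(":=", call("c", names(dots)), unname(dots))  # built inside test()
    j_ok <- quote(c("c") := list(a + 1))                      # written by hand

    # The printed forms look alike ...
    print(j_fn)
    print(j_ok)

    # ... but the right-hand sides are different kinds of object.
    class(j_fn[[3]])  # "list": a plain list holding the call a + 1
    class(j_ok[[3]])  # "call": an unevaluated call to list()
    ```

    Evaluating a plain list just returns the list, which is why every row of the new column ends up holding the unevaluated call.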

    To fix this, you need to splice the list of calls into a single call. You can do this directly in a call to ':='() (Option 1 below), or you can break it into multiple steps that mirror what you were doing above by converting dots into a call to list() (Option 2).

    library(data.table)
    
    data <- data.table::data.table(a = 1:5, b = 2:6)
    
    # Option 1 - call to ':='
    test <- function(data, ...) {
      dots <- eval(substitute(alist(...)))
      j <- bquote(':='(..(dots)), splice = TRUE)
      print(j)
      data[, j, env = list(j = j)][]
    }
    
    # # Option 2 - convert dots to a call to a list
    # test <- function(data, ...) {
    #   dots <- eval(substitute(alist(...)))
    #   dots_names <- names(dots)
    #   dots <- bquote(list(..(unname(dots))), splice = TRUE)
    #   j <- call(":=", dots_names, dots)
    #   print(j)
    #   data[, j, env = list(j = j)][]
    # }
    
    test(data = data, c = a + 1, double_b = b * 2)
    #> `:=`(c = a + 1, double_b = b * 2)
    #>        a     b     c double_b
    #>    <int> <int> <num>    <num>
    #> 1:     1     2     2        4
    #> 2:     2     3     3        6
    #> 3:     3     4     4        8
    #> 4:     4     5     5       10
    #> 5:     5     6     6       12
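
    If you want to check the splicing step on its own, a rough sketch in base R might look like this. It assumes an R version whose bquote() has the splice argument, and it uses alist() directly in place of the eval(substitute(alist(...))) line inside test():

    ```r
    # Capture two expressions the same way test() does, here via alist().
    dots <- alist(c = a + 1, double_b = b * 2)

    # Splice the list elements in as named arguments of `:=`.
    j <- bquote(`:=`(..(dots)), splice = TRUE)

    is.call(j)      # TRUE: a single call, not a list of calls
    names(j)        # "" "c" "double_b": the argument names become column names
    as.list(j)$c    # a + 1, still unevaluated
    ```

    Because j is now one call whose arguments are the unevaluated expressions, data.table can evaluate a + 1 and b * 2 against the columns of the table.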
    

    Edit: You can also use test2(), which applies the expressions one at a time, if you want to modify the same column more than once or use newly made columns in later expressions.

    test2 <- function(data, ...) {
      dots <- eval(substitute(alist(...)))
      dots_names <- names(dots)
      for (i in seq_along(dots)) {
        dot_name <- dots_names[[i]]
        dot <- dots[[i]]
        j <- call(":=", dot_name, dot)
        print(j)
        data[, j, env = list(j = j)]
      }
      data[]
    }
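
    One thing to keep in mind with test2(): := modifies a data.table by reference, so the table you pass in is changed as well. If that is not what you want, a minimal sketch of a copying variant could look like the following. The name test2_copy is made up for illustration, and the block assumes a data.table version that supports the env argument:

    ```r
    library(data.table)

    # Hypothetical variant of test2() that works on a copy of its input.
    test2_copy <- function(data, ...) {
      dots <- eval(substitute(alist(...)))
      dots_names <- names(dots)
      out <- copy(data)  # `:=` modifies by reference, so copy first
      for (i in seq_along(dots)) {
        # One `:=` call per expression, so later ones can see earlier columns.
        j <- call(":=", dots_names[[i]], dots[[i]])
        out[, j, env = list(j = j)]
      }
      out[]
    }

    data <- data.table(a = 1:5, b = 2:6)
    res <- test2_copy(data, c = a + 1, d = c * 2)

    names(data)  # "a" "b": the original table is untouched
    names(res)   # "a" "b" "c" "d"
    ```

    Here d = c * 2 can refer to c because each expression is applied in its own step, after the previous column already exists.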