Tags: r, data.table, purrr, quasiquotes

Make a function (which uses data.table) that supports both quoted and unquoted arguments and then works in purrr::map (or lapply)


I have the following function which works only for quoted variables:

library(data.table) # version 1.11.8
library(purrr)

col_count <- function(dt, vars = NULL){
  dt[, .N, by = vars]
}

I have also written essentially the same function so that it accepts both quoted and unquoted variables (this works because the 'by' argument in data.table accepts either a list or a character vector).

col_count2 <- function(dt, ...){
  vars <- NULL
  try(if(is.character(...)) {vars <- unlist(eval(substitute(...)))}, silent = TRUE)

  if(is.null(vars)) vars <- substitute(list(...))

  dt[, .N, by = vars]
}
dt_iris <- data.table::as.data.table(iris)

identical(col_count2(dt_iris, Petal.Width, Species), 
          col_count2(dt_iris, c('Petal.Width', 'Species')))
[1] TRUE
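
As a minimal sketch of the point about 'by' (the names by_chr and by_list are only illustrative), data.table groups the same way whether 'by' is given a character vector of column names or a list of unquoted columns:

```r
library(data.table)

dt_iris <- as.data.table(iris)

# Grouping columns given as a character vector (quoted names)
by_chr <- dt_iris[, .N, by = c("Petal.Width", "Species")]

# Grouping columns given as a list of unquoted column names
by_list <- dt_iris[, .N, by = list(Petal.Width, Species)]

# Both forms should describe the same groups and counts
all.equal(by_chr, by_list)
```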

Now I want to iterate my function with purrr::map (or lapply), as follows:

purrr::map(colnames(dt_iris), ~ col_count(dt_iris, .))

which works.

But

purrr::map(colnames(dt_iris), ~ col_count2(dt_iris, .))
Error in eval(bysub, x, parent.frame()) : object '.' not found 

The error comes from data.table, which ends up with '.' in the 'by =' expression. Inside the purrr lambda the argument is written as the symbol '.', and since the character branch does not set vars, the function falls through to:

vars <- substitute(list(...))
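
To make that fall-through concrete, here is a small base-R sketch. The helper show_captured and the function wrapper are hypothetical, introduced only for illustration; wrapper stands in for the purrr lambda, whose argument is also named '.'. substitute() captures whatever was typed at the call site, so it sees the symbol '.' rather than the column name stored in it:

```r
# Hypothetical helper: return the unevaluated `...` arguments as strings
show_captured <- function(...) {
  as.character(substitute(list(...))[-1])
}

# Called directly, the column names themselves are captured
show_captured(Petal.Width, Species)
# [1] "Petal.Width" "Species"

# Called through a wrapper whose argument is named `.` (as in a purrr
# lambda), only the symbol `.` is captured, not its value
wrapper <- function(.) show_captured(.)
wrapper("Species")
# [1] "."
```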

I'm guessing there's a better way to solve the quote/unquote problem and stay within data.table.


Solution

  • We can use:

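    library(data.table)
    
    dt_iris <- data.table::as.data.table(iris)
    
    col_count2 <- function(dt, ...){
       # Quoted input: evaluating the first argument yields a character vector.
       # Unquoted input: evaluation typically fails (no such object), so vars is NULL.
       vars <- tryCatch(list(...)[[1]], error = function(e) NULL)
       # Fall back to the unevaluated argument names, converted to strings.
       if(!is.character(vars)) vars <- as.character(substitute(list(...))[-1])
       dt[, .N, by = vars]
    }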
    
    identical(col_count2(dt_iris, Petal.Width, Species), 
              col_count2(dt_iris, c('Petal.Width', 'Species')))
    #[1] TRUE
    
    purrr::map(colnames(dt_iris), col_count2, dt = dt_iris)
    
    #[[1]]
    #    Sepal.Length  N
    # 1:          5.1  9
    # 2:          4.9  6
    # 3:          4.7  2
    # 4:          4.6  4
    #....
    
    #[[2]]
    #    Sepal.Width  N
    # 1:         3.5  6
    # 2:         3.0 26
    # 3:         3.2 13
    # 4:         3.1 11
    #....
    #....
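
Since the question also mentions lapply, the following sketch shows the same function driven by base R instead of purrr. It assumes the col_count2 definition from this answer (repeated here so the block runs on its own); res_lapply and res_lambda are illustrative names. An explicit anonymous function works too, because the first '...' argument is evaluated where it was written, and there it resolves to the column name string:

```r
library(data.table)

dt_iris <- as.data.table(iris)

# Same approach as in the answer: try the evaluated value first,
# then fall back to the unevaluated argument names
col_count2 <- function(dt, ...) {
  vars <- tryCatch(list(...)[[1]], error = function(e) NULL)
  if (!is.character(vars)) vars <- as.character(substitute(list(...))[-1])
  dt[, .N, by = vars]
}

# Pass the function itself; each column name lands in `...`
res_lapply <- lapply(colnames(dt_iris), col_count2, dt = dt_iris)

# Or wrap the call in an anonymous function
res_lambda <- lapply(colnames(dt_iris), function(col) col_count2(dt_iris, col))

# One result per column; the last one counts rows per Species
res_lapply[[5]]
```

One caveat of this design: when the first argument evaluates successfully to a character vector, any further arguments in '...' are ignored, so several quoted names should be passed together as c('a', 'b').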