I have the following function which works only for quoted variables:
library(data.table) # version 1.11.8
library(purrr)
col_count <- function(dt, vars = NULL){
  dt[, .N, by = vars]
}
I have written an equivalent function that accepts both quoted and unquoted variables (relying on data.table's 'by' argument, which accepts either a list or a character vector):
col_count2 <- function(dt, ...){
  vars <- NULL
  try(if(is.character(...)) {vars <- unlist(eval(substitute(...)))}, silent = TRUE)
  if(is.null(vars)) vars <- substitute(list(...))
  dt[, .N, by = vars]
}
dt_iris <- data.table::as.data.table(iris)
identical(col_count2(dt_iris, Petal.Width, Species),
col_count2(dt_iris, c('Petal.Width', 'Species')))
[1] TRUE
Now I want to be able to iterate my function with purrr::map (or lapply), as follows:
purrr::map(colnames(dt_iris), ~ col_count(dt_iris, .))
This works. But the following does not:
purrr::map(colnames(dt_iris), ~ col_count2(dt_iris, .))
Error in eval(bysub, x, parent.frame()) : object '.' not found
This error comes from data.table, which receives '.' as the value of vars in the 'by =' expression. purrr::map passes '.', and since the character branch does not set vars, execution falls through to:
vars <- substitute(list(...))
I'm guessing there's a better way to solve the quote/unquote problem while staying within data.table.
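We can use tryCatch to evaluate the arguments first, falling back to the unevaluated names if that does not yield a character vector: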
library(data.table)
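col_count2 <- function(dt, ...){
  vars <- NULL
  # Evaluate the arguments; unquoted column names raise an error, leaving vars NULL
  tryCatch({vars <- list(...)[[1]]}, error = function(e) {})
  # Otherwise capture the unevaluated arguments and convert them to strings
  if(!is.character(vars)) vars <- as.character(substitute(list(...))[-1])
  dt[, .N, by = vars]
}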
identical(col_count2(dt_iris, Petal.Width, Species),
col_count2(dt_iris, c('Petal.Width', 'Species')))
#[1] TRUE
purrr::map(colnames(dt_iris), col_count2, dt = dt_iris)
#[[1]]
# Sepal.Length N
# 1: 5.1 9
# 2: 4.9 6
# 3: 4.7 2
# 4: 4.6 4
#....
#[[2]]
# Sepal.Width N
# 1: 3.5 6
# 2: 3.0 26
# 3: 3.2 13
# 4: 3.1 11
#....
#....
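To see why the original version fails and what the fallback does, here is a minimal base R sketch of the capture logic on its own. The names `show_capture`, `via_lambda` and `capture_vars` are hypothetical helpers made up for illustration, not part of data.table or purrr, and the sketch assumes no object called `Petal.Width` exists in the calling environment.

```r
# Hypothetical helper: show what substitute() sees when called through a lambda.
show_capture <- function(...) substitute(list(...))
via_lambda <- function(.) show_capture(.)
via_lambda("Species")
# The captured expression is the symbol `.`, not the string "Species",
# which is what data.table then fails to find.

# Hypothetical helper: return column names as a character vector whether
# they were passed quoted or unquoted (same logic as col_count2 above).
capture_vars <- function(...) {
  # Try to evaluate the arguments; unquoted column names normally fail here
  # because no object of that name exists where the function was called.
  vars <- tryCatch(list(...)[[1]], error = function(e) NULL)
  if (!is.character(vars)) {
    # Fall back to the unevaluated arguments, converted to strings.
    vars <- as.character(substitute(list(...))[-1])
  }
  vars
}

capture_vars(Petal.Width, Species)          # unquoted names
capture_vars(c("Petal.Width", "Species"))   # character vector
lapply("Species", function(.) capture_vars(.))  # `.` evaluates to a string

# Caveat: if a character object with the same name as a column exists,
# evaluation succeeds and its value is used instead of the name.
Sepal.Length <- "Species"
capture_vars(Sepal.Length)
```

The last call illustrates the main trade-off of evaluating first: a character variable in the calling environment that shares a name with a column will silently take precedence over the unquoted name.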