r function functional-programming readr

My file reading function using the additional parameters construct ... and optional parameters doesn't work


I'm writing a function read_list_if whose inputs are:

  • a list files_list of files to read
  • a function read_func to read each file
  • and optionally a function select_func to skip files which don't satisfy a certain boolean condition.

The full code is

read_func <- function(...){
  read_csv(..., 
           col_types = cols(
             .default= col_integer()),
           col_names = TRUE)
}


read_list_if <- function(files_list, read_func, select_func = NULL, ...){

  if (is.null(select_func)) {

    read_and_assign <- function(dataset, read_func, ...){
      dataset_name <- as.name(dataset)
      dataset_name <- read_func(dataset, ...)
      return(dataset_name)
    }

  } else

    read_and_assign <- function(dataset, read_func, select_func, ...){
      dataset_name <- as.name(dataset)
      dataset_name <- read_func(dataset,...)
      if (select_func(dataset_name)) {
        return(dataset_name)
      } 
      else return(NULL)
    }

  # invisible is used to suppress the unneeded output
  output <- invisible(
    sapply(files_list,
           read_and_assign, read_func = read_func, 
           select_func = select_func, ..., 
           simplify = FALSE, USE.NAMES = TRUE))

}


library(readr)
files <- list.files(pattern = "\\.csv$")  # pattern is a regular expression, not a glob
datasets <- read_list_if(files, read_func)

Save the code in a script (e.g., test.R) in the same directory with at least one .csv file (even an empty one, created with touch foo.csv, will work). If you now source("test.R"), you get the error:

Error in read_csv(..., col_types = cols(.default = col_integer()), col_names = TRUE) : 
  unused argument (select_func = NULL)

The weird thing is that if there is no .csv file in the directory, then no error is produced. I guess this happens because, when the first argument to sapply, i.e. files_list, is an empty list, then the rest of the arguments are not evaluated (R lazy evaluation).
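The guess above can be checked with a minimal sketch that needs no .csv files. The names one_arg_reader and bogus are hypothetical and not part of the original script. When sapply has nothing to iterate over, it never calls the function at all, so the mismatched argument never reaches a call that could reject it. That would explain the missing error more directly than lazy evaluation of the remaining arguments.

```r
# Hypothetical stand-in for read_and_assign: accepts exactly one argument.
one_arg_reader <- function(dataset) dataset

# Empty input: the function is never called, so the extra `bogus`
# argument (which one_arg_reader does not accept) causes no error.
empty_result <- sapply(character(0), one_arg_reader, bogus = 1,
                       simplify = FALSE, USE.NAMES = TRUE)

# One element: the call is attempted and the unused argument is rejected.
one_result <- tryCatch(
  sapply("foo.csv", one_arg_reader, bogus = 1,
         simplify = FALSE, USE.NAMES = TRUE),
  error = function(e) "error"
)
```

Here empty_result is an empty list, while one_result holds the string produced by the error handler.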


Solution

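  • The easiest fix is probably to "slurp up" the NULL select_func argument by declaring select_func as a parameter of read_and_assign in the is.null branch too. The sapply call always passes select_func = select_func, so a read_and_assign without that parameter collects it in ... and forwards it to read_func, and from there to read_csv, which rejects it as an unused argument. Once select_func is a named parameter, it is matched there and no longer travels through ... .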

    # ....
    if (is.null(select_func)) {
    
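      # select_func is declared only to capture the argument that sapply
      # always passes; it is unused here and so never reaches read_func via ...
      read_and_assign <- function(dataset, read_func, select_func, ...){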
        dataset_name <- as.name(dataset)
        dataset_name <- read_func(dataset, ...)
        return(dataset_name)
      }
    
    } else
    # ....
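
The mechanism behind this fix can be reproduced without readr. The following is only a sketch under stated assumptions: strict_reader is a hypothetical stand-in for read_csv, and leaky and fixed are hypothetical names for the two variants of read_and_assign.

```r
# Hypothetical stand-in for read_csv(): accepts only the arguments it names.
strict_reader <- function(file, sep = ",") {
  paste("read", file, "with sep", sep)
}

# Forwards everything in `...` to the reader, like read_func in the question.
read_func <- function(...) strict_reader(...)

# No `select_func` parameter: `select_func = NULL` falls into `...`
# and is forwarded to strict_reader, which rejects it.
leaky <- function(dataset, read_func, ...) {
  read_func(dataset, ...)
}

# `select_func` is a named parameter: the argument is matched here
# and is not forwarded through `...`.
fixed <- function(dataset, read_func, select_func, ...) {
  read_func(dataset, ...)
}

files <- c("a.csv", "b.csv")

# Capture the error message instead of stopping the script.
leaky_result <- tryCatch(
  sapply(files, leaky, read_func = read_func, select_func = NULL,
         simplify = FALSE, USE.NAMES = TRUE),
  error = function(e) conditionMessage(e)
)

fixed_result <- sapply(files, fixed, read_func = read_func, select_func = NULL,
                       simplify = FALSE, USE.NAMES = TRUE)
```

In this sketch leaky_result ends up holding an error message that names select_func, while fixed_result is a list with one entry per file name.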