How to pass arguments to a callback function for readr::read_csv_chunked


I've been playing around with readr's chunked reading functions (read_delim_chunked and its relatives). From the documentation, it's not clear whether, or how, arguments can be passed into the callback function. For instance, take this example from the documentation:

# Cars with 3 gears
f <- function(x, pos) {
  dplyr::filter(x, .data[["gear"]] == 3)
}

readr::read_csv_chunked(
  readr::readr_example("mtcars.csv"), 
  readr::DataFrameCallback$new(f), 
  chunk_size = 5)

# A tibble: 15 x 11
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
   <dbl> <int> <int> <int> <dbl> <dbl> <dbl> <int> <int> <int> <int>
 1  21.4     6   258   110  3.08 3.215 19.44     1     0     3     1
 2  18.7     8   360   175  3.15 3.440 17.02     0     0     3     2
 3  18.1     6   225   105  2.76 3.460 20.22     1     0     3     1

This works fine. But what if I wanted to parameterize the gear value? For instance,

f <- function(x, pos, gear_val) {
  dplyr::filter(x, .data[["gear"]] == gear_val)
}

readr::read_csv_chunked(
  readr::readr_example("mtcars.csv"),
  readr::DataFrameCallback$new(f, gear_val = 3),
  chunk_size = 5
)

Error in .subset2(public_bind_env, "initialize")(...) :
  unused argument (gear_val = 3)

I've tried various ways of passing a parameter through to the callback function, but none of them work. Does anyone have any ideas on how to do this?


Solution

  • Use a function factory (a closure) in this case: a function that takes gear_val and returns the two-argument callback (x, pos) that readr expects, e.g.

    f <- function(gear_val) {
      function(x, pos) {
        dplyr::filter(x, .data[["gear"]] == gear_val)
      }
    }
    
    readr::read_csv_chunked(
      readr::readr_example("mtcars.csv"),
      readr::DataFrameCallback$new(f(gear_val = 3)),
      chunk_size = 5
    )
    #> Parsed with column specification:
    #> cols(
    #>   mpg = col_double(),
    #>   cyl = col_double(),
    #>   disp = col_double(),
    #>   hp = col_double(),
    #>   drat = col_double(),
    #>   wt = col_double(),
    #>   qsec = col_double(),
    #>   vs = col_double(),
    #>   am = col_double(),
    #>   gear = col_double(),
    #>   carb = col_double()
    #> )
    #> # A tibble: 15 x 11
    #>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
    #>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    #>  1  21.4    6.  258.  110.  3.08  3.22  19.4    1.    0.    3.    1.
    #>  2  18.7    8.  360.  175.  3.15  3.44  17.0    0.    0.    3.    2.
    #>  3  18.1    6.  225.  105.  2.76  3.46  20.2    1.    0.    3.    1.
    #>  4  14.3    8.  360.  245.  3.21  3.57  15.8    0.    0.    3.    4.
    #>  5  16.4    8.  276.  180.  3.07  4.07  17.4    0.    0.    3.    3.
    #>  6  17.3    8.  276.  180.  3.07  3.73  17.6    0.    0.    3.    3.
    #>  7  15.2    8.  276.  180.  3.07  3.78  18.0    0.    0.    3.    3.
    #>  8  10.4    8.  472.  205.  2.93  5.25  18.0    0.    0.    3.    4.
    #>  9  10.4    8.  460.  215.  3.00  5.42  17.8    0.    0.    3.    4.
    #> 10  14.7    8.  440.  230.  3.23  5.34  17.4    0.    0.    3.    4.
    #> 11  21.5    4.  120.   97.  3.70  2.46  20.0    1.    0.    3.    1.
    #> 12  15.5    8.  318.  150.  2.76  3.52  16.9    0.    0.    3.    2.
    #> 13  15.2    8.  304.  150.  3.15  3.44  17.3    0.    0.    3.    2.
    #> 14  13.3    8.  350.  245.  3.73  3.84  15.4    0.    0.    3.    4.
    #> 15  19.2    8.  400.  175.  3.08  3.84  17.0    0.    0.    3.    2.
    

    Created on 2018-03-12 by the reprex package (v0.2.0).
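
To see why the factory works without involving readr at all, here is a minimal base-R sketch. The names make_gear_filter, filter_gear and wrap_gear are made up for illustration, the built-in mtcars data frame stands in for the CSV file, and the chunk loop only approximates what a data-frame callback does (call the function on each chunk, then row-bind the results); it is not readr's actual implementation. It also shows a second option: keeping the three-argument function from the question and wrapping it.

```r
# Function factory: returns a two-argument callback with gear_val baked in.
make_gear_filter <- function(gear_val) {
  force(gear_val)  # evaluate the argument now instead of at the first callback call
  function(x, pos) {
    x[x$gear == gear_val, , drop = FALSE]
  }
}

# Alternative: keep the three-argument function from the question and
# wrap it in a two-argument function that supplies gear_val.
filter_gear <- function(x, pos, gear_val) {
  x[x$gear == gear_val, , drop = FALSE]
}
wrap_gear <- function(gear_val) {
  force(gear_val)
  function(x, pos) filter_gear(x, pos, gear_val = gear_val)
}

# Hand-rolled stand-in for chunked reading (an approximation, see above):
# call the callback on 5-row chunks of mtcars and row-bind the pieces.
run_chunked <- function(data, callback, chunk_size = 5) {
  starts <- seq(1, nrow(data), by = chunk_size)
  pieces <- lapply(starts, function(pos) {
    rows <- pos:min(pos + chunk_size - 1, nrow(data))
    callback(data[rows, , drop = FALSE], pos)
  })
  do.call(rbind, pieces)
}

result_factory <- run_chunked(mtcars, make_gear_filter(3))
result_wrapper <- run_chunked(mtcars, wrap_gear(3))

nrow(result_factory)                       # 15, matching the 15-row tibble above
identical(result_factory, result_wrapper)  # TRUE

# With readr installed, either callback can be passed on as in the answer:
# readr::read_csv_chunked(
#   readr::readr_example("mtcars.csv"),
#   readr::DataFrameCallback$new(make_gear_filter(3)),
#   chunk_size = 5
# )
```

Both variants give readr a function of exactly (x, pos), while gear_val lives in the enclosing environment of that function. The force() call is a precaution against lazy evaluation: it pins the value when the callback is created, which matters if the variable passed in is changed before the first chunk is processed.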