Tags: r, function, dplyr, functional-programming, purrr

Problem using select and filter when used in purrr::compose with purrr::partial in R


I'm trying to compose a set of partially applied functions using the purrr package. I can do this with some functions but not others, and I'd like to know why (or what I'm missing, if it is possible).

Specifically, I can use rename, mutate and head (and I've also successfully used custom functions that I defined), but I can't seem to use select and filter.

In the code below, if I uncomment any of the commented-out lines, I get an error like: object 'milespergallon' not found.

I've tried quoting the variables and changing the order in which I compose the functions, but nothing seems to work. Is there a general restriction at play here?

library("purrr")
library("dplyr")

my_func = compose(
  partial(rename, "milespergallon" = mpg),         # works fine
  partial(mutate, new_col = milespergallon + cyl), # works fine
  # partial(select, milespergallon, cyl, new_col), # not sure why this one fails?
  # partial(filter, cyl > 2),                      # not sure why this one fails?
  # partial(filter, milespergallon > 20),          # not sure why this one fails?
  head,                                            # works fine
  .dir = "forward"
)

my_func(mtcars)

With the offending lines left commented out, I get sensible output:

my_func(mtcars)
                  milespergallon cyl disp  hp drat    wt  qsec vs am gear carb new_col
Mazda RX4                   21.0   6  160 110 3.90 2.620 16.46  0  1    4    4    27.0
Mazda RX4 Wag               21.0   6  160 110 3.90 2.875 17.02  0  1    4    4    27.0
Datsun 710                  22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    26.8
Hornet 4 Drive              21.4   6  258 110 3.08 3.215 19.44  1  0    3    1    27.4
Hornet Sportabout           18.7   8  360 175 3.15 3.440 17.02  0  0    3    2    26.7
Valiant                     18.1   6  225 105 2.76 3.460 20.22  1  0    3    1    24.1

Solution

  • If you check out the ?partial help page, you'll see how to control where unnamed arguments are placed. This should work:

    my_func = compose(
      partial(rename, "milespergallon" = mpg),
      partial(mutate, new_col = milespergallon + cyl),
      partial(select, ... = , milespergallon, cyl, new_col),
      partial(filter, ... = , cyl > 2),
      partial(filter, ... = , milespergallon > 20),
      head,
      .dir = "forward"
    )
    

    Without the ... = placeholder, you can see the function that partial creates with

    partial(select,  milespergallon, cyl, new_col)
    # <partialised>
    # function (...) 
    # select(milespergallon, cyl, new_col, ...)
    

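    Notice how it puts the ... at the end of the argument list, so the data frame arrives after milespergallon, cyl and new_col, and milespergallon is matched to the first (data) argument instead. That is where the "object 'milespergallon' not found" error comes from. If you use named arguments, as in the first two rename and mutate calls, this doesn't matter: the data frame is unnamed, so it is still matched to the first parameter. Compare that to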

    partial(select,  ... = , milespergallon, cyl, new_col)
    # <partialised>
    # function (...) 
    # select(..., milespergallon, cyl, new_col)
    

    Now the ... comes before the unnamed arguments, so the data frame is passed in the first position of the call, where select expects it.
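
The effect of the empty ... = placeholder can be seen in isolation, without dplyr. The following is only a minimal sketch using base R's list; the names args_first and dots_first are made up for illustration.

```r
library(purrr)

# Default: the pre-filled arguments come first, new arguments are appended.
args_first <- partial(list, "a", "b")

# With an empty `... = `, new arguments are spliced in at that position.
dots_first <- partial(list, ... = , "a", "b")

str(args_first("x"))  # elements in order: "a", "b", "x"
str(dots_first("x"))  # elements in order: "x", "a", "b"
```

The same reordering is what moves the data frame to the front of the select and filter calls.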

    You could also use . %>% to make a function:

    my_func = compose(
      . %>% rename("milespergallon" = mpg),
      . %>% mutate(new_col = milespergallon + cyl),
      . %>% select(milespergallon, cyl, new_col),
      . %>% filter(cyl > 2),
      . %>% filter(milespergallon > 20),
      head,
      .dir = "forward"
    )
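
This works because a magrittr pipeline that starts with the . placeholder is not evaluated; it returns a function of one argument. A small sketch, with root_of_sum as a made-up name:

```r
library(magrittr)

# A pipeline that starts with `.` builds a function instead of running.
root_of_sum <- . %>% sum() %>% sqrt()

is.function(root_of_sum)  # TRUE
root_of_sum(c(9, 16))     # 5, i.e. sqrt(9 + 16)
```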
    

    Or use the formula (mapper) syntax:

    my_func = compose(
      ~rename(., "milespergallon" = mpg),
      ~mutate(., new_col = milespergallon + cyl),
      ~select(., milespergallon, cyl, new_col),
      ~filter(., cyl > 2),
      ~filter(., milespergallon > 20),
      head,
      .dir = "forward"
    )
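
As a sanity check, the composed function should give the same result as writing the steps as an ordinary dplyr pipeline. The sketch below assumes purrr and dplyr are installed and uses the formula variant; composed and piped are illustrative names, not part of either package.

```r
library(purrr)
library(dplyr)

# Composed version, using formulas where `.` stands for the data frame.
composed <- compose(
  ~rename(., milespergallon = mpg),
  ~mutate(., new_col = milespergallon + cyl),
  ~select(., milespergallon, cyl, new_col),
  ~filter(., cyl > 2),
  ~filter(., milespergallon > 20),
  head,
  .dir = "forward"
)

# The same steps written as a plain pipeline.
piped <- mtcars %>%
  rename(milespergallon = mpg) %>%
  mutate(new_col = milespergallon + cyl) %>%
  select(milespergallon, cyl, new_col) %>%
  filter(cyl > 2) %>%
  filter(milespergallon > 20) %>%
  head()

# Stops with an error if the two results differ.
stopifnot(isTRUE(all.equal(composed(mtcars), piped)))
```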