
Function for dplyr with an argument that defaults to "."


Let's say I want to sum over all columns in a tibble to create a new column called "total". I could do:

library(tibble)
library(dplyr)

set.seed(42)
N <- 10
Df <- tibble(p_1 = rnorm(N),
             p_2 = rnorm(N),
             q_1 = rnorm(N),
             q_2 = rnorm(N))

# Works fine
Df %>% mutate(total = apply(., 1, sum))

I could make a helper function like so,

myfun <- function(Df){
  apply(Df, 1, sum)
}

# Works fine
Df %>% mutate(total = myfun(.))

But let's say myfun is usually going to be used this way, i.e. within a dplyr verb. Then the "." referencing the data frame is a bit superfluous, and it would be nice if myfun could supply it as a default value. I'd like something like this:

myfun2 <- function(Df=.){
   apply(Df, 1, sum)
}

which does not work.

Df %>% mutate(total = myfun2())
Error in mutate_impl(.data, dots) : 
 Evaluation error: object '.' not found.

Because I am not even sure how the "." works, I don't think I can formulate the question any better. Basically, I want to know whether there is a way of saying, in effect: if Df is not supplied to myfun2, use the data frame that is normally referenced by "."?
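
As background on the error, a default argument is evaluated inside the function's own environment, not in the environment of whoever calls it. The base R sketch below illustrates that scoping rule with hypothetical names (`hidden_value`, `use_default`, `use_caller`); it says nothing about how magrittr binds ".", only why a bare name in a default cannot see a caller's local variables.

```r
# Hypothetical helper: the default refers to a name the function cannot see.
use_default <- function(x = hidden_value) x

caller <- function() {
  hidden_value <- 1   # local to caller(), invisible to use_default()'s default
  use_default()
}

# The default is evaluated inside use_default(), so the lookup fails
# (assuming no global variable called hidden_value exists).
failed <- tryCatch(caller(), error = function(e) conditionMessage(e))

# Looking the name up explicitly in the calling frame does find it.
use_caller <- function(x = get("hidden_value", envir = parent.frame())) x

caller2 <- function() {
  hidden_value <- 1
  use_caller()
}

found <- caller2()
```

Here `failed` holds the error message and `found` holds the caller's value. Whether the same `parent.frame()` lookup would reach the pipe's "." from inside mutate() depends on magrittr and dplyr internals, so this is an illustration of the scoping rule rather than a suggested fix.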


Solution

  • One option would be to have the function return a quoted expression and then unquote it with !! inside mutate()

    library(tidyverse)
    myfun <- function() {
       quote(reduce(., `+`))
    }
    
    r1 <- Df %>% 
              mutate(total = !! myfun())
    r1
    # A tibble: 10 x 5
    #       p_1    p_2    q_1     q_2  total
    #     <dbl>  <dbl>  <dbl>   <dbl>  <dbl>
    # 1  1.37    1.30  -0.307  0.455   2.82 
    # 2 -0.565   2.29  -1.78   0.705   0.645
    # 3  0.363  -1.39  -0.172  1.04   -0.163
    # 4  0.633  -0.279  1.21  -0.609   0.960
    # 5  0.404  -0.133  1.90   0.505   2.67 
    # 6 -0.106   0.636 -0.430 -1.72   -1.62 
    # 7  1.51   -0.284 -0.257 -0.784   0.186
    # 8 -0.0947 -2.66  -1.76  -0.851  -5.37 
    # 9  2.02   -2.44   0.460 -2.41   -2.38 
    #10 -0.0627  1.32  -0.640  0.0361  0.654
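
To see what the unquoting step does, the sketch below builds the call with rlang's expr() instead of running it. It assumes only that the rlang package is installed; `Df`, `mutate` and `reduce` are never evaluated here, they appear only as symbols inside the call. The point is that !! pastes the quoted expression into the mutate() call, so the "." ends up written directly in the pipeline step, just as if it had been typed by hand.

```r
library(rlang)

# Same helper as above: it returns an unevaluated call, not a value.
myfun <- function() {
  quote(reduce(., `+`))
}

# expr() builds a call without evaluating it; !! inlines the quoted expression.
inlined <- expr(mutate(Df, total = !!myfun()))

# The call one would otherwise have typed by hand.
written_out <- quote(mutate(Df, total = reduce(., `+`)))

same_call <- identical(inlined, written_out)
```

Since the "." is part of the expression that mutate() receives, it is looked up where the pipe has bound it. This relies on the magrittr pipe %>%, which is what the code above uses.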
    

    Note that reduce was used to be more in line with the tidyverse, but the OP's function can also be quoted to get the same result

    myfun2 <- function() {
       quote(apply(., 1,  sum ))
    }
    
    r2 <- Df %>%
            mutate(total = !! myfun2())
    all.equal(r2$total, r1$total)
    #[1] TRUE
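
As a rough check of why the two versions agree, the sketch below compares the approaches on a tiny hand-checkable data frame. It is an illustration under assumptions: `toy` is hypothetical data, and base R's Reduce() stands in for purrr's reduce() so that no packages are needed. rowSums() is included as a further base R way to compute the same row totals.

```r
# Hypothetical toy data, small enough to check by hand.
toy <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30), c = c(100, 200, 300))

# A data frame is a list of columns, so this computes a + b + c element-wise.
by_reduce <- Reduce(`+`, toy)

# apply() converts to a matrix and sums each row; unname() drops any row names.
by_apply <- unname(apply(toy, 1, sum))

# rowSums() is the vectorised base R equivalent.
by_rowsums <- unname(rowSums(toy))

reduce_matches_apply   <- isTRUE(all.equal(by_reduce, by_apply))
reduce_matches_rowsums <- isTRUE(all.equal(by_reduce, by_rowsums))
```

Adding the columns together and summing across each row give the same totals as long as every column is numeric, which is the case for Df above.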