Pass multiple arguments to ddply


I am trying to write a function that takes a list as input and returns a summarised data frame. However, after trying several approaches, I am unable to pass the list to the function for the aggregation.

So far I have the following, but it is failing.

library(plyr)  # ddply() and .() come from plyr, not dplyr

random_df <- data.frame(
  region = c("A", "B", "C", "C"),
  number_of_reports = c(1, 3, 2, 1),
  report_MV = c(12, 33, 22, 12)
)

output_graph <- function(input) {
    print(input$arguments)
    DF <- input$DF
    group_by <- input$group_by
    args <- input$arguments
    flow <- ddply(DF, group_by, summarize, args)
    return(flow)
}

graph_functions <- list(
    DF = random_df,
    group_by = .(region),
    arguments = .(Reports = sum(number_of_reports),
                  MV_Reports = sum(report_MV))
)

output_graph(graph_functions)

Whereas this version, with the summaries written out inside the function, works:

library(plyr)  # ddply() and .() come from plyr, not dplyr

random_df <- data.frame(
  region = c("A", "B", "C", "C"),
  number_of_reports = c(1, 3, 2, 1),
  report_MV = c(12, 33, 22, 12)
)

output_graph <- function(input) {
    print(input$arguments)
    DF <- input$DF
    group_by <- input$group_by
    args <- input$arguments
    flow <- ddply(
      DF,
      group_by, 
      summarize,
      Reports = sum(number_of_reports),
      MV_Reports = sum(report_MV)
    )
    return(flow)
}

graph_functions <- list(
  DF = random_df,
  group_by = .(region),
  arguments = .(Reports = sum(number_of_reports),
                MV_Reports = sum(report_MV))
)

output_graph(graph_functions)

Does anyone know a way to pass a list of functions to ddply, or another way to achieve the same goal of aggregating a dynamic set of variables?


Solution

  • To pass arguments into a function for use by dplyr, I recommend reading up on non-standard evaluation (NSE). Here is an edited function that produces the same output as my original.

    library(dplyr)
    
    random_df <- data.frame(
      region = c('A','B','C','C'),
      number_of_reports = c(1, 3, 2, 1),
      report_MV = c(12, 33, 22, 12)
    )
    
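    output_graph <- function(df, group, args) {
    
      # Capture the bare grouping column as a quosure
      grp_quo <- enquo(group)
    
      df %>%
        group_by(!!grp_quo) %>%  # unquote the grouping column
        summarise(!!!args)       # splice in the list of quoted summaries
    
    }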
    
    args <- list(
      Reports = quo(sum(number_of_reports)),
      MV_Reports = quo(sum(report_MV))
    )
    
    output_graph(random_df, region, args)
    
    # # A tibble: 3 x 3
    #   region Reports MV_Reports
    #   <fctr>   <dbl>      <dbl>
    # 1 A         1.00       12.0
    # 2 B         3.00       33.0
    # 3 C         3.00       34.0
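
If the function really needs to take a single list, as in the question, the same idea can probably be adapted along the following lines. This is only a sketch: it assumes a dplyr version that supports tidy evaluation (`!!!` and `quos()`), and it swaps the plyr `.()` calls from the question for `quos()`, so the list holds quosures rather than plyr quoted objects. The list element names (`DF`, `group_by`, `arguments`) are taken from the question.

```r
library(dplyr)

random_df <- data.frame(
  region = c("A", "B", "C", "C"),
  number_of_reports = c(1, 3, 2, 1),
  report_MV = c(12, 33, 22, 12)
)

# Takes one list holding the data, the grouping columns and the summaries
output_graph <- function(input) {
  input$DF %>%
    group_by(!!!input$group_by) %>%   # splice the quoted grouping columns
    summarise(!!!input$arguments)     # splice the quoted summary expressions
}

graph_functions <- list(
  DF = random_df,
  group_by = quos(region),            # quos() instead of plyr's .()
  arguments = quos(
    Reports = sum(number_of_reports),
    MV_Reports = sum(report_MV)
  )
)

result <- output_graph(graph_functions)
print(result)
```

Because both `group_by` and `arguments` are spliced with `!!!`, adding another grouping column or another summary should only require changing the list, not the function.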