I have a very simple function that returns a data frame. It takes three arguments: a dataset and the names of two variables present in that dataset.
I was hoping to use the map/pmap family of functions to feed in a vector/list of inputs and produce a single (long) output dataset, but I can't get them to work. What can I try next?
The function is pretty basic:
library(dplyr)
library(purrr) # needed for map2_dfr() further down
# Function takes dataset and 2 categorical variables
# Calculates number of records for each combination of values in the two variables
# Calculates % of var 1 responses for each level of var2
ugh<-function(data, var1, var2){
# add checks to make sure vars on dataset
tab_n<-data%>%
group_by_at(c(var1, var2))%>%
summarise(Numerator=n(), .groups="drop")%>%
group_by_at(c(var2))%>%
mutate(Denominator=sum(Numerator)
,Pct=Numerator/Denominator*100
# storing names of var1 and var2 for future subsetting
, Var1=var1
, Var2=var2)%>%
rename(Var1_levels=var1
, Var2_levels=var2
)
}
# Sample output
combo1<-mtcars%>%ugh(var1="cyl", var2="gear")
# can also run this as:
# combo1<-ugh(data=mtcars, var1="cyl", var2="gear")
combo2<-mtcars%>%ugh(var1="cyl", var2="carb")
sampleOutput<-rbind(combo1, combo2)
# Trying to use map to generate sampleOutput
var1_vector=rep("cyl", 2)
var2_vector=c("gear", "carb")
plswork<-mtcars%>%
map2_dfr(var1=var1_vector, var2=var2_vector, ugh)
The error message I get is:
Error in as_mapper(.f, ...) : argument ".f" is missing, with no default
I've tried using ~ to specify the function, using map2 and binding the rows separately, and using pmap with a list of inputs, but I'm not having much luck.
(I am also interested in more efficient ways to summarise a subset of columns from a data frame by a different subset of columns.)
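The error comes from argument matching. map2_dfr() expects .x, .y and .f, in that order. Piping mtcars in makes it .x, the named var1 = and var2 = arguments fall through to ..., and ugh, the only remaining unnamed argument, is matched to .y. That leaves .f missing, which is exactly what the error message says.
There are a few ways to fix this. I recommend the first because I think it's the clearest. I also tweaked your function to replace deprecated/superseded functions and behaviour: group_by_at() becomes group_by(across(all_of(...))), and the examples use map2() or pmap() followed by list_rbind() (purrr 1.0.0 or later) instead of the superseded map2_dfr().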
library(dplyr)
library(purrr)
# Function takes dataset and 2 categorical variables
# Calculates number of records for each combination of values in the two variables
# Calculates % of var 1 responses for each level of var2
ugh<-function(data, var1, var2){
# add checks to make sure vars on dataset
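# return the pipeline result directly; ending the function on an assignment
# (tab_n<-...) would return the value invisibly
data%>%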
group_by(across(all_of(c(var1, var2))))%>%
summarise(Numerator=n(), .groups="drop")%>%
group_by(across(all_of(var2)))%>%
mutate(Denominator=sum(Numerator)
,Pct=Numerator/Denominator*100
# storing names of var1 and var2 for future subsetting
, Var1= .env$var1
, Var2= .env$var2)%>%
rename(Var1_levels= all_of(var1)
, Var2_levels= all_of(var2)
)
}
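The "add checks" placeholder in the function could be filled in along these lines. This is only a minimal sketch: check_vars() is a hypothetical helper name (it is not part of dplyr or purrr), and it assumes var1 and var2 are passed as character strings, as in the examples here.

```r
# Hypothetical helper: stop early if any requested column is absent from `data`.
check_vars <- function(data, vars) {
  missing_vars <- setdiff(vars, names(data))
  if (length(missing_vars) > 0) {
    stop(
      "Column(s) not found in data: ",
      paste(missing_vars, collapse = ", "),
      call. = FALSE
    )
  }
  # Return the data invisibly so the helper can sit at the top of a pipeline
  invisible(data)
}

check_vars(mtcars, c("cyl", "gear")) # passes silently
# check_vars(mtcars, c("cyl", "gears")) would stop with an error naming "gears"
```

Calling check_vars(data, c(var1, var2)) as the first line of ugh() would give a clearer message than the one all_of() raises on its own.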
# Sample output
combo1<-mtcars%>%ugh(var1="cyl", var2="gear")
# can also run this as:
# combo1<-ugh(data=mtcars, var1="cyl", var2="gear")
combo2<-mtcars%>%ugh(var1="cyl", var2="carb")
sampleOutput<-rbind(combo1, combo2)
sampleOutput
#> # A tibble: 17 × 7
#> # Groups: Var2_levels [7]
#> Var1_levels Var2_levels Numerator Denominator Pct Var1 Var2
#> <dbl> <dbl> <int> <int> <dbl> <chr> <chr>
#> 1 4 3 1 15 6.67 cyl gear
#> 2 4 4 8 12 66.7 cyl gear
#> 3 4 5 2 5 40 cyl gear
#> 4 6 3 2 15 13.3 cyl gear
#> 5 6 4 4 12 33.3 cyl gear
#> 6 6 5 1 5 20 cyl gear
#> 7 8 3 12 15 80 cyl gear
#> 8 8 5 2 5 40 cyl gear
#> 9 4 1 5 7 71.4 cyl carb
#> 10 4 2 6 10 60 cyl carb
#> 11 6 1 2 7 28.6 cyl carb
#> 12 6 4 4 10 40 cyl carb
#> 13 6 6 1 1 100 cyl carb
#> 14 8 2 4 10 40 cyl carb
#> 15 8 3 3 3 100 cyl carb
#> 16 8 4 6 10 60 cyl carb
#> 17 8 8 1 1 100 cyl carb
# Trying to use map to generate sampleOutput
var1_vector=rep("cyl", 2)
var2_vector=c("gear", "carb")
# Method 1 (recommended): use of anonymous functions
map2(var1_vector, var2_vector, \(var1, var2) ugh(mtcars, var1, var2)) %>%
list_rbind()
#> # A tibble: 17 × 7
#> # Groups: Var2_levels [7]
#> Var1_levels Var2_levels Numerator Denominator Pct Var1 Var2
#> <dbl> <dbl> <int> <int> <dbl> <chr> <chr>
#> 1 4 3 1 15 6.67 cyl gear
#> 2 4 4 8 12 66.7 cyl gear
#> 3 4 5 2 5 40 cyl gear
#> 4 6 3 2 15 13.3 cyl gear
#> 5 6 4 4 12 33.3 cyl gear
#> 6 6 5 1 5 20 cyl gear
#> 7 8 3 12 15 80 cyl gear
#> 8 8 5 2 5 40 cyl gear
#> 9 4 1 5 7 71.4 cyl carb
#> 10 4 2 6 10 60 cyl carb
#> 11 6 1 2 7 28.6 cyl carb
#> 12 6 4 4 10 40 cyl carb
#> 13 6 6 1 1 100 cyl carb
#> 14 8 2 4 10 40 cyl carb
#> 15 8 3 3 3 100 cyl carb
#> 16 8 4 6 10 60 cyl carb
#> 17 8 8 1 1 100 cyl carb
# If your R version is older than 4.1.0 (no \(x) lambda shorthand), use a purrr-style formula instead:
map2(var1_vector, var2_vector, ~ ugh(mtcars, .x, .y)) %>%
list_rbind()
#> # A tibble: 17 × 7
#> # Groups: Var2_levels [7]
#> Var1_levels Var2_levels Numerator Denominator Pct Var1 Var2
#> <dbl> <dbl> <int> <int> <dbl> <chr> <chr>
#> 1 4 3 1 15 6.67 cyl gear
#> 2 4 4 8 12 66.7 cyl gear
#> 3 4 5 2 5 40 cyl gear
#> 4 6 3 2 15 13.3 cyl gear
#> 5 6 4 4 12 33.3 cyl gear
#> 6 6 5 1 5 20 cyl gear
#> 7 8 3 12 15 80 cyl gear
#> 8 8 5 2 5 40 cyl gear
#> 9 4 1 5 7 71.4 cyl carb
#> 10 4 2 6 10 60 cyl carb
#> 11 6 1 2 7 28.6 cyl carb
#> 12 6 4 4 10 40 cyl carb
#> 13 6 6 1 1 100 cyl carb
#> 14 8 2 4 10 40 cyl carb
#> 15 8 3 3 3 100 cyl carb
#> 16 8 4 6 10 60 cyl carb
#> 17 8 8 1 1 100 cyl carb
# Alternatively, using pmap():
args <- list(
var1 = var1_vector,
var2 = var2_vector
)
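# mtcars is forwarded through ... to every call; naming it makes the match to `data` explicit
pmap(args, ugh, data = mtcars) %>%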
list_rbind()
#> # A tibble: 17 × 7
#> # Groups: Var2_levels [7]
#> Var1_levels Var2_levels Numerator Denominator Pct Var1 Var2
#> <dbl> <dbl> <int> <int> <dbl> <chr> <chr>
#> 1 4 3 1 15 6.67 cyl gear
#> 2 4 4 8 12 66.7 cyl gear
#> 3 4 5 2 5 40 cyl gear
#> 4 6 3 2 15 13.3 cyl gear
#> 5 6 4 4 12 33.3 cyl gear
#> 6 6 5 1 5 20 cyl gear
#> 7 8 3 12 15 80 cyl gear
#> 8 8 5 2 5 40 cyl gear
#> 9 4 1 5 7 71.4 cyl carb
#> 10 4 2 6 10 60 cyl carb
#> 11 6 1 2 7 28.6 cyl carb
#> 12 6 4 4 10 40 cyl carb
#> 13 6 6 1 1 100 cyl carb
#> 14 8 2 4 10 40 cyl carb
#> 15 8 3 3 3 100 cyl carb
#> 16 8 4 6 10 60 cyl carb
#> 17 8 8 1 1 100 cyl carb
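On the side question about summarising one subset of columns by another: one possible approach, sketched below under some assumptions, is to build every var1/var2 pairing with expand.grid() and hand the resulting data frame to pmap(), which iterates over its rows. pct_by() is a hypothetical rewrite of ugh() that uses count() in place of group_by() plus summarise(), and the choice of "am" as a second var1 is only for illustration. It assumes dplyr 1.0.0 or later and purrr 1.0.0 or later.

```r
library(dplyr)
library(purrr)

# Hypothetical variant of ugh(): same output columns, built on count()
pct_by <- function(data, var1, var2) {
  data %>%
    # count() groups, tallies and ungroups in one step
    count(across(all_of(c(var1, var2))), name = "Numerator") %>%
    group_by(across(all_of(var2))) %>%
    mutate(
      Denominator = sum(Numerator),
      Pct = Numerator / Denominator * 100
    ) %>%
    ungroup() %>%
    rename(Var1_levels = all_of(var1), Var2_levels = all_of(var2)) %>%
    # .env$ makes sure the function arguments are used, not columns of the same name
    mutate(Var1 = .env$var1, Var2 = .env$var2)
}

# Every combination of the two sets of variable names, one row per pairing
combos <- expand.grid(
  var1 = c("cyl", "am"),
  var2 = c("gear", "carb"),
  stringsAsFactors = FALSE
)

# pmap() matches the data frame's column names to the function's arguments
all_tabs <- pmap(combos, function(var1, var2) pct_by(mtcars, var1, var2)) %>%
  list_rbind()
```

Because list_rbind() needs compatible column types, this only works as written when the level columns share a type across pairings (all numeric in mtcars); mixed types would need converting, for example to character, inside pct_by().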