I have a simple helper function that applies left_join
to any number of passed tables in order to gather
them and return one object.
# Settings ----------------------------------------------------------------
library("tidyverse")
set.seed(123)
# Data --------------------------------------------------------------------
sample_one <-
  tibble(
    column_a = c(1, 2),
    column_b = runif(n = 2),
    column_other = runif(n = 2)
  )
sample_two <-
  tibble(
    column_a = c(1, 2),
    column_b = runif(n = 2),
    column_other = runif(n = 2)
  )
sample_three <-
  tibble(
    column_a = c(1, 2),
    column_b = runif(n = 2),
    column_other = runif(n = 2)
  )
# Function ----------------------------------------------------------------
left_join_on_column_a <- function(keep_var, ...) {
  keep_var <- enquo(keep_var)
  dots <- list(...)
  clean_dfs <- map(dots, select, !!keep_var, "column_a")
  reduce(.x = clean_dfs,
         .f = left_join,
         "column_a") %>%
    gather(key = "model_type", !!keep_var, -column_a)
}
# Test --------------------------------------------------------------------
left_join_on_column_a(keep_var = column_b, sample_one, sample_two, sample_three)
I would like to be able to programmatically modify the suffix argument of left_join:
suffix: If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.
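To make the suffix behaviour concrete, here is a minimal sketch with three small hypothetical tables (x, y and z, with made-up values, not the sample data above). It suggests why a single suffix pair is not enough when joins are chained: suffixes are only added when a join actually sees a name clash.

```r
library(dplyr)

# Hypothetical toy tables, used only for this illustration
x <- tibble(column_a = c(1, 2), column_b = c(10, 20))
y <- tibble(column_a = c(1, 2), column_b = c(30, 40))
z <- tibble(column_a = c(1, 2), column_b = c(50, 60))

# First join: column_b exists on both sides, so both suffixes are applied
xy <- left_join(x, y, by = "column_a", suffix = c("_one", "_two"))
names(xy)

# Second join: the left side no longer has a plain column_b,
# so there is no clash and the third column_b keeps its name
xyz <- left_join(xy, z, by = "column_a", suffix = c("_one", "_two"))
names(xyz)
```

The last column stays as column_b, which matches the unsuffixed rows in the results of the function.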
Current results:
# A tibble: 6 x 3
  column_a model_type column_b
     <dbl> <chr>         <dbl>
1        1 column_b.x   0.288
2        2 column_b.x   0.788
3        1 column_b.y   0.940
4        2 column_b.y   0.0456
5        1 column_b     0.551
6        2 column_b     0.457
Desired results:
# A tibble: 6 x 3
  column_a model_type   column_b
     <dbl> <chr>           <dbl>
1        1 sample_one     0.288
2        2 sample_one     0.788
3        1 sample_two     0.940
4        2 sample_two     0.0456
5        1 sample_three   0.551
6        2 sample_three   0.457
The model_type column should reflect the name of each object passed via the ... argument.
I tried to capture the names of the objects passed via ..., but the arguments are passed positionally, so list(...) carries no names, names(dots) returns NULL, and the attempt below does not work:
left_join_on_column_a <- function(keep_var, ...) {
  keep_var <- enquo(keep_var)
  dots <- list(...)
  table_names <- names(dots)
  clean_dfs <- map(dots, select, !!keep_var, "column_a")
  reduce(.x = clean_dfs,
         .f = left_join,
         "column_a",
         table_names) %>%
    gather(key = "model_type", !!keep_var, -column_a)
}
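For reference, one way to recover the object names is to look at the unevaluated call instead of the evaluated list. The following is a minimal base R sketch. The helper name dots_names is hypothetical, and it assumes that every argument is a bare object name (a long expression could deparse to several strings and make vapply fail).

```r
# Hypothetical helper: returns the expressions passed via ... as strings
dots_names <- function(...) {
  # substitute() gives the unevaluated call list(a, b, ...);
  # dropping the first element removes the `list` symbol itself
  args <- as.list(substitute(list(...)))[-1]
  vapply(args, deparse, character(1))
}

a <- 1
b <- 2

# Positional arguments produce an unnamed list, so names() is NULL
plain_names <- names(list(a, b))

# The symbols survive in the captured call
captured <- dots_names(a, b)
```

Here plain_names is NULL, while captured holds the strings "a" and "b", which could then be used as column names or labels.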
One option is to rename column_b in each table before joining,
so that you don't have to worry about suffix at all:
left_join_on_column_a <- function(keep_var, common_var, ...) {
  # Names of the objects passed via ..., e.g. "sample_one"
  nm <- unname(sapply(rlang::enexprs(...), as.character))
  # Convert the bare column names to strings
  keep_var <- as.character(substitute(keep_var))
  common_var <- as.character(substitute(common_var))
  # Keep the join key and rename keep_var to the name of its table
  foo <- function(x, y) {
    x %>% select(!!common_var, !!y := !!keep_var)
  }
  reduce(.x = Map(foo, list(...), nm),
         .f = left_join,
         common_var) %>%
    gather("model_type", !!keep_var, -!!common_var)
}
left_join_on_column_a(column_b, column_a, sample_one, sample_two, sample_three)
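Since the result ends up in long format anyway, a different technique is to skip the join and stack the tables with bind_rows(), using its .id argument to record where each row came from. The sketch below is an alternative, not the approach from the answer above. The function name stack_on_column_a and the toy values are hypothetical, and it assumes that all arguments are bare object names. Note that it only gives the same rows as the join when every table contains the same keys, because left_join would drop keys that are missing from the first table.

```r
library(dplyr)

# Hypothetical alternative: stack the tables instead of joining them
stack_on_column_a <- function(keep_var, common_var, ...) {
  # Object names as written in the call, e.g. "sample_one"
  nm <- vapply(as.list(substitute(list(...)))[-1], deparse, character(1))
  # Convert the bare column names to strings
  keep_var <- as.character(substitute(keep_var))
  common_var <- as.character(substitute(common_var))
  # Keep only the key and the value column of every table
  dfs <- lapply(list(...), function(df) {
    select(df, all_of(c(common_var, keep_var)))
  })
  # Name the list so that bind_rows() writes the names into model_type
  names(dfs) <- nm
  bind_rows(dfs, .id = "model_type")
}

# Toy data with made-up values, so the block runs on its own
sample_one   <- tibble(column_a = c(1, 2), column_b = c(0.1, 0.2))
sample_two   <- tibble(column_a = c(1, 2), column_b = c(0.3, 0.4))
sample_three <- tibble(column_a = c(1, 2), column_b = c(0.5, 0.6))

out <- stack_on_column_a(column_b, column_a,
                         sample_one, sample_two, sample_three)
```

The model_type column comes first in this version; select() can reorder it if the column order matters.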