Imagine you have two dataframes in R, for example. Imagine that columns could have any kind of names, not always a or b. Also there could be much more columns, dozens or even hundreds.
a <-
data.frame(
id = seq(1, 4),
a = c(0, 0, 1, 1),
b = c(3, 3, 1, 1)
)
b <-
data.frame(
id = seq(1, 4),
a = c("0%", "0%", "50%", "50%"),
b = c("37.5%", "37.5%", "12.5%", "12.5%")
)
I'd like to merge it to following result:
result <-
data.frame(
id = seq(1, 4),
a = c("0 (0%)", "0 (0%)", "1 (50%)", "1 (50%)"),
b = c("3 (37.5%)", "3 (37.5%)", "1 (12.5%)", "1 (12.5%)")
)
This seems like a join with some string-concatenation:
library(dplyr)
full_join(a, b, by = "id") %>%
mutate(across(ends_with(".x"), ~ sprintf("%s (%s)", .x, pick(sub("x$", "y", cur_column()))[[1]]), .names = "{sub('.x$', '', .col)}")) %>%
select(-matches("\\.[xy]$"))
# id a b
# 1 1 0 (0%) 3 (37.5%)
# 2 2 0 (0%) 3 (37.5%)
# 3 3 1 (50%) 1 (12.5%)
# 4 4 1 (50%) 1 (12.5%)
It's certainly possible to do this in base R or data.table
, if needed, though with a little more code.
tmp <- merge(a, b, by = "id")
tmp
# id a.x b.x a.y b.y
# 1 1 0 3 0% 37.5%
# 2 2 0 3 0% 37.5%
# 3 3 1 1 50% 12.5%
# 4 4 1 1 50% 12.5%
xs <- grep("\\.x$", names(tmp))
ys <- grep("\\.y$", names(tmp))
Map(function(m, n) sprintf("%s (%s)", m, n), tmp[,xs], tmp[,ys]) |>
setNames(sub(".x$", "", xs)) |>
cbind(tmp[,-c(xs, ys), drop=FALSE])
# 2 3 id
# 1 0 (0%) 3 (37.5%) 1
# 2 0 (0%) 3 (37.5%) 2
# 3 1 (50%) 1 (12.5%) 3
# 4 1 (50%) 1 (12.5%) 4