I have a dataframe with a column of ids, but for some rows there are multiple ids concatenated together. I want to merge this onto another dataframe using the id, and when the ids are concatenated it handles that and reflects it by having the values in the new columns added also concatenated.
For example I have dataframes
data <- data.frame(
id = c(1, 4, 3, "2,3", "1,4"),
value = c(1:5)
)
> data
id value
1 1 1
2 4 2
3 3 3
4 2,3 4
5 1,4 5
mapping <- data.frame(
id = 1:4,
name = c("one", "two", "three", "four")
)
> mapping
id name
1 1 one
2 2 two
3 3 three
4 4 four
I would like to end up with
id value name
1 1 1 one
2 4 2 four
3 3 3 three
4 2,3 4 two,three
5 1,4 5 one,four
I don't think there's a good way to do this other than to separate, join, and re-concatenate:
library(dplyr)
library(tidyr)
data %>%
mutate(true_id = row_number()) %>%
separate_rows(id, convert = TRUE) %>%
left_join(mapping, by = "id") %>%
group_by(true_id, value) %>%
summarize(id = toString(id), name = toString(name), .groups = "drop")
# # A tibble: 5 × 4
# true_id value id name
# <int> <int> <chr> <chr>
# 1 1 1 1 one
# 2 2 2 4 four
# 3 3 3 3 three
# 4 4 4 2, 3 two, three
# 5 5 5 1, 4 one, four
I wasn't sure if your value
column would actually be unique, so I added a true_id
just in case.