Search code examples
rdplyr

tag many-to-many relationship when using left_join in dplyr


Is there a method(s) to tag "many-to-many、one-to-many and many-to-one" relationship, like in the example below?

df1 <- data.frame(
  id1 = c(1, 1, 3, 4),
  value1 = c("A", "B", "C", "D")
)

df2 <- data.frame(
  id2 = c(1, 1, 3, 3, 4),
  value2 = c("X", "Y", "Z1", "Z2", "W")
)

# Performing a left join to tag the many-to-many relationship and adding a tag variable
result <- df1 %>%
  left_join(df2, by = c("id1" = "id2")) %>%
  group_by(id1) %>%
  mutate(
    tag = if_else(n() > 1, "many-to-many", "one-to-many")
  ) %>%
  ungroup()

result
# A tibble: 7 × 5
    id1 value1   id2 value2 tag        
  <dbl> <chr>  <dbl> <chr>  <chr>      
1     1 A          1 X      many-to-many
2     1 A          1 Y      many-to-many
3     1 B          1 X      many-to-many
4     1 B          1 Y      many-to-many
5     3 C          3 Z1     many-to-many
6     3 C          3 Z2     many-to-many
7     4 D          4 W      one-to-many

In the example above, I have tagged the "many-to-many、one-to-many" but not many-to-one. How can I do that?


Solution

  • # adding rows to both of the dataframes, to demonstrate the # many-to-one join
    df1 <- df1 |> add_row(id1 = c(5, 5), value1 = c("E", "F"))
    df2 <- df2 |> add_row(id2 = 5, value2 = "Z3")
    
    # check if there is more than one unique value 
    # in each group.
    # note: this approach doesn't work if you have identical rows, I think!
    f <- \(x) unique(x) %>% {ifelse(length(.) > 1, "many", "one")}
    
    full_join(df1, df2, by = c("id1" = "id2")) |>
      mutate(tag = paste0(f(value1), "-to-", f(value2)), .by = id1)
    

    Output:

      id1 value1 value2          tag
    1   1      A      X many-to-many
    2   1      A      Y many-to-many
    3   1      B      X many-to-many
    4   1      B      Y many-to-many
    5   3      C     Z1  one-to-many
    6   3      C     Z2  one-to-many
    7   4      D      W   one-to-one
    8   5      E     Z3  many-to-one
    9   5      F     Z3  many-to-one