Search code examples
rleft-joingroupmutate

Is there an R code that checks whether all values in a group are the same as all values in another group?


Data I have:

A B
1 a
2 c
2 e
3 f
4 h
5 c
5 e

What I want:

If all values of B in different groups of A match, then they’re in the same group. In this case, all B values (c and e) match for 2 and 5.

A B Group
1 a 1
2 c 2
2 e 2
3 f 3
4 h 4
5 c 2
5 e 2

Code I attempted:

library(readxl)
library(dplyr)
library(stringr)
data1 <- read_excel("testing.xlsx")
data2 <- data1 %>% 
  group_by(A) %>% 
  group_by(B) %>% 
  mutate(Group = cur_group_id()) %>% 
  ungroup()

What I’m getting from this code:

A B Group
1 a 1
2 c 2
2 e 3
3 f 4
4 h 5
5 c 2
5 e 3

NOTE: The original data I am manipulating has been left-joined and then grouped.


Solution

  • One trick is to group by the list of elements that share the same "A". You can then use cur_group_id properly. I add an arrange call first to make sure that identical groups are recognized as such even if the order is different in the original data.

    library(dplyr)
    library(tidyr)
    dat |> 
      arrange(A, B) |> 
      summarise(B = list(B), .by = A) |> 
      mutate(Group = cur_group_id(), .by = B) |> 
      unnest_longer(B)
    
    #   A B Group
    # 1 1 a     1
    # 2 2 c     2
    # 3 2 e     2
    # 4 3 f     3
    # 5 4 h     4
    # 6 5 c     2
    # 7 5 e     2