I have a set of variables for which I want all pairwise comparisons, dropping rows where a variable is compared with itself (e.g., "A" == "A") and keeping only one row for each pair that differs only in order, i.e. keep either "A" vs "B" or "B" vs "A", not both.
I have this code that does it in R:
sp.all.var = c(LETTERS[1:10])
length(sp.all.var)^2
df.pairwise = expand.grid(sp.all.var,sp.all.var)
nrow(df.pairwise)
df.pairwise.sub1 = df.pairwise[df.pairwise$Var1!=df.pairwise$Var2,]
df.pairwise.sub1$compare = apply(df.pairwise.sub1, 1, function(x) paste(sort(x), collapse = "-"))
nrow(df.pairwise.sub1)
df.pairwise.sub2 = df.pairwise.sub1[!duplicated(df.pairwise.sub1$compare), ]
nrow(df.pairwise.sub2)
I was wondering whether there is a simpler way to do this. Is there a built-in function or a package that does it?
You probably want combn.
combn(LETTERS[1:10], 2, paste, collapse = "-")
#> [1] "A-B" "A-C" "A-D" "A-E" "A-F" "A-G" "A-H" "A-I" "A-J" "B-C" "B-D" "B-E"
#> [13] "B-F" "B-G" "B-H" "B-I" "B-J" "C-D" "C-E" "C-F" "C-G" "C-H" "C-I" "C-J"
#> [25] "D-E" "D-F" "D-G" "D-H" "D-I" "D-J" "E-F" "E-G" "E-H" "E-I" "E-J" "F-G"
#> [37] "F-H" "F-I" "F-J" "G-H" "G-I" "G-J" "H-I" "H-J" "I-J"
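As a quick sanity check, the sketch below compares the labels from the question's expand.grid approach with the ones combn returns. The names vars, grid, from_grid and from_combn are made up for illustration, and stringsAsFactors = FALSE is my addition so the columns stay character.

```r
vars <- LETTERS[1:10]

# Question's approach: full grid, drop self-pairs, then drop order duplicates
grid <- expand.grid(vars, vars, stringsAsFactors = FALSE)
grid <- grid[grid$Var1 != grid$Var2, ]
grid$compare <- apply(grid, 1, function(x) paste(sort(x), collapse = "-"))
from_grid <- unique(grid$compare)

# combn approach: every unordered pair exactly once
from_combn <- combn(vars, 2, paste, collapse = "-")

# Both should describe the same choose(10, 2) = 45 unordered pairs
length(from_combn) == choose(length(vars), 2)
setequal(from_grid, from_combn)
```

Both comparisons should come back TRUE, since combn never generates self-pairs or reversed duplicates in the first place.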
Or as a data.frame:
as.data.frame(t(combn(LETTERS[1:10], 2, \(x) c(x, paste(x, collapse = "-")))))
#> V1 V2 V3
#> 1 A B A-B
#> 2 A C A-C
#> 3 A D A-D
#> 4 A E A-E
#> 5 A F A-F
#> 6 A G A-G
#> 7 A H A-H
#> 8 A I A-I
#> 9 A J A-J
#> 10 B C B-C
#> 11 B D B-D
#> 12 B E B-E
#> 13 B F B-F
#> 14 B G B-G
#> 15 B H B-H
#> 16 B I B-I
#> 17 B J B-J
#> 18 C D C-D
#> 19 C E C-E
#> 20 C F C-F
#> 21 C G C-G
#> 22 C H C-H
#> 23 C I C-I
#> 24 C J C-J
#> 25 D E D-E
#> 26 D F D-F
#> 27 D G D-G
#> 28 D H D-H
#> 29 D I D-I
#> 30 D J D-J
#> 31 E F E-F
#> 32 E G E-G
#> 33 E H E-H
#> 34 E I E-I
#> 35 E J E-J
#> 36 F G F-G
#> 37 F H F-H
#> 38 F I F-I
#> 39 F J F-J
#> 40 G H G-H
#> 41 G I G-I
#> 42 G J G-J
#> 43 H I H-I
#> 44 H J H-J
#> 45 I J I-J
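If you would rather keep the column names from the question (Var1, Var2, compare) than the default V1, V2, V3, something along these lines might work. It is only a sketch; vars, pairs and df are illustrative names.

```r
vars <- LETTERS[1:10]

# combn returns a 2 x 45 matrix; transpose so each row is one pair
pairs <- t(combn(vars, 2))

# Build the data frame with explicit column names
df <- data.frame(Var1 = pairs[, 1], Var2 = pairs[, 2])
df$compare <- paste(df$Var1, df$Var2, sep = "-")

nrow(df)  # 45, i.e. choose(10, 2)
```

Because combn keeps the input order within each pair, Var1 always comes before Var2 in vars, so there is no need to sort before pasting.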