I am trying to get from this:
session location sequence weight INDIVIDUAL action
a1 texas 1 10 john Z1
a1 texas 2 5 peter Z2
a1 texas 3 3 ben Z1
a1 texas 4 5 peter Z5
a2 calif 1 25 esther Z3
a2 calif 2 5 peggy Z2
a2 calif 3 10 greg Z5
to this:
INDIVIDUAL1 INDIVIDUAL2 weight
john peter 10
john ben 10
peter john 5
peter ben 5
ben john 3
ben peter 3
peter john 5
peter ben 5
I am exploring a number of options, including for-loops, but I am a little concerned that they may take too long as my dataset gets really big. Any pointers would be greatly appreciated!
Thank you!
Here's a simple approach with a self-join. I'll leave the dropping of the sequence and session columns to you.
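library(dplyr)

# Self-join on session: pairs every row with every row of the same session
df %>%
  select(session, weight, sequence, INDIVIDUAL) %>%
  inner_join(., select(., session, INDIVIDUAL), by = "session") %>%
  rename(INDIVIDUAL1 = INDIVIDUAL.x, INDIVIDUAL2 = INDIVIDUAL.y) %>%
  filter(INDIVIDUAL1 != INDIVIDUAL2) %>%  # drop pairs of a person with themselves
  unique() %>%                            # drop exact duplicate rows
  arrange(session, sequence)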
#    session weight sequence INDIVIDUAL1 INDIVIDUAL2
#  1      a1     10        1        john       peter
#  2      a1     10        1        john         ben
#  3      a1      5        2       peter        john
#  4      a1      5        2       peter         ben
#  5      a1      3        3         ben        john
#  6      a1      3        3         ben       peter
#  7      a1      5        4       peter        john
#  8      a1      5        4       peter         ben
#  9      a2     25        1      esther       peggy
# 10      a2     25        1      esther        greg
# 11      a2      5        2       peggy      esther
# 12      a2      5        2       peggy        greg
# 13      a2     10        3        greg      esther
# 14      a2     10        3        greg       peggy
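
For completeness, here is a minimal, self-contained sketch of the same self-join that rebuilds the sample data from the question and also drops the session and sequence columns at the end. A few things here are my own assumptions rather than part of the approach above: the result name `pairs` is made up, and I use the `suffix` argument of `inner_join()` as a shortcut for the `rename()` step. Recent dplyr versions (1.1.0 and later) may warn about a many-to-many relationship on this join; that is expected for a self-join like this one.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  session    = c("a1", "a1", "a1", "a1", "a2", "a2", "a2"),
  location   = c("texas", "texas", "texas", "texas", "calif", "calif", "calif"),
  sequence   = c(1, 2, 3, 4, 1, 2, 3),
  weight     = c(10, 5, 3, 5, 25, 5, 10),
  INDIVIDUAL = c("john", "peter", "ben", "peter", "esther", "peggy", "greg"),
  action     = c("Z1", "Z2", "Z1", "Z5", "Z3", "Z2", "Z5"),
  stringsAsFactors = FALSE
)

# `pairs` is a hypothetical name for the result
pairs <- df %>%
  select(session, sequence, weight, INDIVIDUAL) %>%
  # suffix = c("1", "2") names the clashing columns INDIVIDUAL1 / INDIVIDUAL2
  inner_join(select(df, session, INDIVIDUAL), by = "session",
             suffix = c("1", "2")) %>%
  filter(INDIVIDUAL1 != INDIVIDUAL2) %>%   # drop pairs of a person with themselves
  distinct() %>%                           # drop exact duplicate rows
  arrange(session, sequence) %>%
  select(INDIVIDUAL1, INDIVIDUAL2, weight) # keep only the requested columns

print(pairs)
```

Note that `distinct()` runs before the final `select()`, so rows that differ only by sequence (such as the two peter/john rows with weight 5) are both kept, as in the desired output. Because the join is done in one vectorised step per session, it should scale better than nested for-loops, although the result grows roughly with the square of the session size.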