r dplyr tibble

From a long-format tibble, compare each variable pair with its inverse and pick the higher value


I'm working with a tibble that holds a matrix converted to long format. The matrix is simply the all-vs-all output of a function. However, the output of the function depends on the order of its arguments, i.e. function(x, y) != function(y, x). I'd therefore like to mutate the tibble so that an additional column contains the higher of the two outcomes, function(x, y) or function(y, x), for each pair. I'm struggling with this, but I'm pretty sure there is an elegant solution that I can't see.

library(dplyr)
df1 <- tibble(n1 = rep(c('a', 'b', 'c'), each = 3), n2 = rep(c('a', 'b', 'c'), 3), val = sample(1:100, 9))

df1
n1  n2  val
a   a    41
a   b    94
a   c    40
b   a    85
b   b    82
b   c    35
c   a    66
c   b    70
c   c    76

So the result of mutate would look like this:

n1  n2  val val.high
a   a    41 41
a   b    94 94
a   c    40 66
b   a    85 94
b   b    82 82
b   c    35 70
c   a    66 66
c   b    70 70
c   c    76 76

Solution

  • One option using dplyr could be:

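    df1 %>%
        # Order-independent key: (a, b) and (b, a) both map to "b a"
        group_by(grp = paste(pmax(n1, n2), pmin(n1, n2))) %>%
        # Higher of the two values within each pair
        mutate(val_high = max(val)) %>%
        ungroup() %>%
        select(-grp)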
    
      n1    n2      val val_high
      <chr> <chr> <int>    <int>
    1 a     a        38       38
    2 a     b        37       75
    3 a     c        34       34
    4 b     a        75       75
    5 b     b        91       91
    6 b     c        54       54
    7 c     a         3       34
    8 c     b        16       54
    9 c     c        46       46
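
Because the data above comes from sample(), the numbers change from run to run. As a quick check, here is a minimal sketch that hard-codes the values from the question's example (an assumption made purely for reproducibility) and applies the same grouping idea. The name `res` is just an illustrative variable, not something from the original post.

```r
library(dplyr)

# Fixed values copied from the question's example so the result is reproducible
df1 <- tibble(
  n1  = rep(c("a", "b", "c"), each = 3),
  n2  = rep(c("a", "b", "c"), times = 3),
  val = c(41, 94, 40, 85, 82, 35, 66, 70, 76)
)

res <- df1 %>%
  # pmax()/pmin() sort each pair, so (a, b) and (b, a) share the key "b a"
  group_by(grp = paste(pmax(n1, n2), pmin(n1, n2))) %>%
  # Within each unordered pair, keep the larger of the two values
  mutate(val_high = max(val)) %>%
  ungroup() %>%
  select(-grp)

res$val_high
# 41 94 66 94 82 70 66 70 76
```

The last column matches the expected output given in the question. One caveat: the key relies on paste()'s default space separator, so labels that themselves contain spaces could in principle collide; passing a different `sep` should avoid that.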