r, dataframe, iteration, permutation, subtraction

Iterative function to subtract all possible permutations in R


I have a data frame...

example <- data.frame(obs_val= c(20,15,3,7,5), patient = c("pt1","pt2","pt3","pt4","pt5"))

... where every row or "patient" is a unique observation.

My goal is to generate a data frame that holds the difference between each patient's observed value (obs_val) and every other patient's obs_val. The subtraction should cover each pairing of distinct patients, i.e. pt1 never has its own obs_val subtracted from itself. Ideally, the final data frame should look something like the following:

             pt1-pt2    pt1-pt3    pt1-pt4    pt1-pt5    pt2-pt3    pt2-pt4    ...

obs_val_diff    5          17         13         15         12         8       ...

Any suggestions on solving this problem, or reformatting the final data frame?


Solution

  • Another option is to use combn() to get all pairwise combinations of patients and then map over the pairs to compute the subtractions.

    library(tidyverse)
    
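    # combn() returns one pair per column; transpose so each row is a pair (columns X1, X2)
    data.frame(t(combn(example$patient, 2))) |>
      # look up both patients' obs_val and subtract the second from the first
      mutate(obs_val_diff = map2_dbl(X1, X2, ~ example[example$patient == .x, "obs_val"] -
                                       example[example$patient == .y, "obs_val"])) |>
      # build the "pt1-pt2" labels, then spread them into one wide row
      unite(test, X1, X2, sep = "-") |>
      pivot_wider(names_from = test, values_from = obs_val_diff)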
    #> # A tibble: 1 x 10
    #>   `pt1-pt2` `pt1-pt3` `pt1-pt4` pt1-pt~1 pt2-p~2 pt2-p~3 pt2-p~4 pt3-p~5 pt3-p~6
    #>       <dbl>     <dbl>     <dbl>    <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
    #> 1         5        17        13       15      12       8      10      -4      -2
    #> # ... with 1 more variable: `pt4-pt5` <dbl>, and abbreviated variable names
    #> #   1: `pt1-pt5`, 2: `pt2-pt3`, 3: `pt2-pt4`, 4: `pt2-pt5`, 5: `pt3-pt4`,
    #> #   6: `pt3-pt5`
    

    Or, in base R (the |> pipe and the \(x) function shorthand need R 4.1 or later):

    
    # -diff(c(a, b)) is a - b; this works because combn() keeps each pair in row order
    apply(t(combn(example$patient, 2)), 1, 
          \(x) -diff(example[example$patient %in% x, "obs_val"])) |>
      (\(v) matrix(v, ncol = length(v)))() |>
      as.data.frame() |>
      `colnames<-`(apply(t(combn(example$patient, 2)), 1, 
                         \(x) paste(x, collapse = "-")))
    #>   pt1-pt2 pt1-pt3 pt1-pt4 pt1-pt5 pt2-pt3 pt2-pt4 pt2-pt5 pt3-pt4 pt3-pt5
    #> 1       5      17      13      15      12       8      10      -4      -2
    #>   pt4-pt5
    #> 1       2
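
A more compact base R variant may be possible, because combn() accepts a FUN argument that is applied to each pair. The following is a minimal sketch and not part of the original answer; the names pair_diff, pair_names, combo_df and diff_mat are made up for illustration. It also uses outer() for the case where both orders (pt1-pt2 and pt2-pt1) are wanted, which is only an assumption about what "permutation" means in the question.

```r
# Example data from the question
example <- data.frame(obs_val = c(20, 15, 3, 7, 5),
                      patient = c("pt1", "pt2", "pt3", "pt4", "pt5"))

# Unordered pairs: combn() visits the pairs in the same order in both calls,
# so the differences and the labels line up position by position
pair_diff  <- combn(example$obs_val, 2, FUN = function(v) v[1] - v[2])
pair_names <- combn(example$patient, 2, FUN = paste, collapse = "-")

# One-row data frame; check.names = FALSE keeps names such as "pt1-pt2" intact
# (illustrative names, not from the original answer)
combo_df <- data.frame(t(setNames(pair_diff, pair_names)), check.names = FALSE)

# Ordered pairs (assumption: both pt1-pt2 and pt2-pt1 are wanted):
# outer() computes obs_val[i] - obs_val[j] for every i and j
diff_mat <- outer(example$obs_val, example$obs_val, "-")
dimnames(diff_mat) <- list(example$patient, example$patient)

print(combo_df)
print(diff_mat)
```

Here combo_df has the same shape as the output above (one row, one column per pair), while diff_mat["pt2", "pt1"] looks up a single ordered difference. The diagonal of diff_mat is each patient minus itself and can be ignored.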