R - how to summarise data across pairs?

I have a dataset with the following structure:

study	treatment	severity	people_with_pain	sample_size
0001	paracetamol	5.0	12	24
0001	aspirin	7.0	10	20
0001	massage	10.2	20	21
0002	paracetamol	6.0	.	.
0002	aspirin	6.0	.	.
0003	massage	2.0	10	25
0003	paracetamol	3.5	10	25

I would like to evaluate severity and the proportion of people_with_pain within sample_size across pairs of different treatments (e.g. paracetamol-aspirin). When the outcome is continuous (severity), I am interested in the mean, standard deviation, median and IQR. When the outcome is dichotomous (people_with_pain/sample_size), I am interested in the average proportion.

For instance, for the paracetamol-aspirin comparison, severity would be 6.0 (average between study 0001, 6.0, and study 0002, 6.0. Study 0003 does not contribute because it compares massage and paracetamol). The average proportion of people_with_pain would be 0.5 (study 0001, 22/44. Study 0002 does not contribute as there is no data).

The desired output would as a data.frame per output (e.g. one data.frame for severity, one for proportion of people_with_pain) with comparisons as rows (e.g. paracetamol-aspirin):

comparison	nstudies	severity_mean	severity_sd
paracetamol-aspirin	2	6.0	0
paracetamol-massage	2	4.375	2.30
aspirin-massage	0	NA	NA

I have managed to build a permutation list of available comparison and exclude those duplicated (e.g. paracetamol-aspirin and aspirin-paracetamol), as well as those not meaningful (e.g. paracetamol-paracetamol), but I am stuck there.

Solution

Here is my solution using tidyverse. I am not completely sure that this is what you want because of some differences between my results and your expected results. However, to obtain all the combinations of treatment I've used a left_join of the dataframe with itself, followed by filter(treatment.x < treatment.y) which basically eliminates all the treatment pair duplicates.

library(dplyr)
library(tidyr)
dat |>
  left_join(dat, by = "study") |>
  filter(treatment.x < treatment.y) |>
  unite("treatment", starts_with("treatment"), sep = "-") |>
  group_by(treatment) |>
  summarize({
    severity <- c(severity.x, severity.y)
    people_with_pain <- c(people_with_pain.x, people_with_pain.y)
    sample_size <- c(sample_size.x, sample_size.y)
    data.frame(severity_mean =     mean(severity),
               severity_sd  =        sd(severity),
               severity_median = median(severity),
               severity_IQR =       IQR(severity),
               people_with_pain = sum(people_with_pain)/sum(sample_size),
               nstudies = length(unique(study)))
  })

##>             treatment severity_mean severity_sd severity_median severity_IQR
##> 1     aspirin-massage         8.600   2.2627417            8.60        1.600
##> 2 aspirin-paracetamol         6.000   0.8164966            6.00        0.500
##> 3 massage-paracetamol         5.175   3.5668614            4.25        3.175
##>   people_with_pain nstudies
##> 1        0.7317073        1
##> 2               NA        2
##> 3        0.5473684        2