I have a dataframe in the following form:
dim1 dim2
1 Loc.1 0.325
2 Loc.2 0.325
3 Loc.3 0.321
4 Loc.4 0.256
5 Loc.5 0.255
I would like to compute the mean of each combination of two (2) elements within 'dim2'; and convert the output into a matrix; while keeping the information provided by 'dim1'.
For now, I can get pairwise means using the combn function:
combn(tab[,2],2, mean)
[1] 0.3250 0.3230 0.2905 0.2900 0.3230 0.2905 0.2900 0.2885 0.2880 0.2555
but I would like it to be displayed in a matrix-like form (which would actually be quite similar to an object of class 'dist', as I would like it to be for further analyses) like this:
Loc.1 Loc.2 Loc.3 Loc.4
Loc.2 0.325
Loc.3 0.323 0.323
Loc.4 0.290 0.291 0.289
Loc.5 0.290 0.290 0.288 0.256
(and I also need, as you may see, the information 'Loc.x')
I could not find a simple function that would directly perform pairwise calculations on my dataframe 'tab'. I could use a for loop, but I feel like there should be a more straightforward way.
Any suggestion? Thank you very much!
Here is a one-liner using expand.grid instead of combn:
as.dist(matrix(apply(expand.grid(tab[, 2], tab[, 2]), 1, mean), 5, 5))
# 1 2 3 4
#2 0.3250
#3 0.3230 0.3230
#4 0.2905 0.2905 0.2885
#5 0.2900 0.2900 0.2880 0.2555
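This works because expand.grid returns all n^2 ordered pairs of the values in tab[, 2], including each value paired with itself, so the result fills a complete n x n matrix. combn, by contrast, returns only the n(n-1)/2 unordered pairs of distinct elements and therefore leaves out the diagonal. We then operate row-wise on the table of pairs to calculate the means, and cast the resulting vector first as a matrix and then as a dist object, which keeps only the lower triangle.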
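The one-liner above labels the result 1 to 5 rather than Loc.1 to Loc.5, which the question asks to keep. Here is a minimal sketch of the same steps written out one at a time, assuming the data frame is named tab with columns dim1 and dim2 as in the question. The names pairs, n, m and d are introduced here only for illustration. Setting dimnames on the intermediate matrix is what lets as.dist carry the 'Loc.x' labels through.

```r
# Rebuild the example data from the question
tab <- data.frame(
  dim1 = paste0("Loc.", 1:5),
  dim2 = c(0.325, 0.325, 0.321, 0.256, 0.255)
)

# All n^2 ordered pairs of values, including each value paired with itself
pairs <- expand.grid(tab$dim2, tab$dim2)

# Row-wise mean of each pair, reshaped into an n x n symmetric matrix;
# the dimnames come from dim1 so the locations stay attached
n <- nrow(tab)
labs <- as.character(tab$dim1)
m <- matrix(rowMeans(pairs), nrow = n, ncol = n,
            dimnames = list(labs, labs))

# as.dist() keeps the lower triangle and takes its labels from the row names
d <- as.dist(m)
print(d)
```

Individual values can then be looked up by name, for example as.matrix(d)["Loc.4", "Loc.2"]. Using n <- nrow(tab) instead of a hard-coded 5 should also let the same code run on a table of any length. If only the mean is needed, outer(tab$dim2, tab$dim2, "+") / 2 is another way to build m without expand.grid.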