I need to compute the variance in a population (array) of permutations.
Let's say that I have this array of permutations:
import numpy as np
import scipy.stats as stats
a = np.matrix([[1,2,3,4,5,6], [2,3,4,6,1,5], [6,3,1,2,5,4]])
# distance between a[0] and a[1]
distance = stats.kendalltau(a[0], a[1])[0]
So, how can I compute (in Python) the variance of this array, i.e., measure how far these permutations are from each other?
Regards
Aymeric
P.S.: I define the distance between two permutations by the Kendall tau metric.
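I'm not sure this is the mathematical result you are looking for, but you could use stats.kendalltau to compute the distance between pairs of permutations, then take the variance of the resulting vector.
To get the vector of distances, I loop over zip(a, np.roll(a, shift=1, axis=0)), which pairs each row with the previous one (cyclically). For the three rows here that covers every pair; with more rows it only covers neighbouring rows: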
from scipy.stats import kendalltau

dist = []
for x1, x2 in zip(a, np.roll(a, shift=1, axis=0)):
    # kendalltau returns (tau, p-value); keep only tau
    dist.append(kendalltau(x1, x2)[0])
To take the variance of all the distances (np.std would give the standard deviation instead):
np.var(dist)
Or, if you are looking for the variance as (discussed here), then take the norm of the distance vector:
np.linalg.norm(dist)
Note that I'm using a as defined with np.array, not np.matrix:
a = np.array([[1,2,3,4,5,6], [2,3,4,6,1,5], [6,3,1,2,5,4]])
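If you have more than three permutations, the np.roll trick no longer visits every pair. Here is a minimal sketch that uses itertools.combinations instead. Two things in it are my own assumptions, not something from the question: the helper name pairwise_tau is made up for illustration, and since kendalltau returns a correlation in [-1, 1] rather than a distance, I map it to a normalized distance with (1 - tau) / 2. Drop that step if you want the variance of the raw tau values.

```python
from itertools import combinations

import numpy as np
from scipy.stats import kendalltau

a = np.array([[1, 2, 3, 4, 5, 6],
              [2, 3, 4, 6, 1, 5],
              [6, 3, 1, 2, 5, 4]])


def pairwise_tau(perms):
    """Kendall tau correlation for every unordered pair of rows (hypothetical helper)."""
    return np.array([kendalltau(x, y)[0] for x, y in combinations(perms, 2)])


taus = pairwise_tau(a)

# Assumption: convert the correlation in [-1, 1] to a distance in [0, 1],
# so identical permutations are at distance 0 and reversed ones at distance 1.
dists = (1.0 - taus) / 2.0

# Spread of the pairwise distances
spread = np.var(dists)
print(dists, spread)
```

For n permutations this computes n * (n - 1) / 2 values, so it gets slow for large populations, but each pair is counted exactly once.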