I am trying to find a fully vectorised way to compute the covariance matrix for a 2D numpy array, given a base kernel function. For example, if the input is X = [[a,b],[c,d]] and the kernel function is k(x_1,x_2), the covariance matrix will be
K=[[k(a,a),k(a,b),k(a,c),k(a,d)],
   [k(b,a),k(b,b),k(b,c),k(b,d)],
   [k(c,a),k(c,b),k(c,c),k(c,d)],
   [k(d,a),k(d,b),k(d,c),k(d,d)]]
How do I go about doing this? I am unsure how to repeat the values and then apply the function, and which approach would be the most efficient.
You can use np.meshgrid to get two matrices holding the values for the first and second arguments of the k function.
In [8]: X = np.arange(4).reshape(2,2)
In [9]: np.meshgrid(X, X)
Out[9]:
[array([[0, 1, 2, 3],
        [0, 1, 2, 3],
        [0, 1, 2, 3],
        [0, 1, 2, 3]]),
 array([[0, 0, 0, 0],
        [1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3]])]
You can then pass these matrices straight to the k function, as long as k is built from elementwise array operations:
In [10]: k = lambda x1, x2: (x1-x2)**2
In [11]: X1, X2 = np.meshgrid(X, X)
In [12]: k(X1, X2)
Out[12]:
array([[0, 1, 4, 9],
       [1, 0, 1, 4],
       [4, 1, 0, 1],
       [9, 4, 1, 0]])
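One thing to watch: with meshgrid's default indexing, entry [i, j] of the result is k(x_j, x_i) rather than k(x_i, x_j). That makes no difference for a symmetric kernel such as the one above, but it transposes the result for a non-symmetric one. The sketch below is one possible way to get the orientation asked for in the question, using plain broadcasting instead of materialising the two grids. The names kernel_matrix and k_asym are made up for illustration, and the sketch assumes k only uses elementwise operations.

```python
import numpy as np

def kernel_matrix(k, X):
    """Return K with K[i, j] = k(x_i, x_j) over the flattened entries of X.

    Hypothetical helper; assumes k is built from elementwise array
    operations so that it broadcasts.
    """
    x = np.asarray(X).ravel()
    # x[:, None] has shape (n, 1) and x[None, :] has shape (1, n);
    # broadcasting expands the pair to (n, n) without building the grids.
    return k(x[:, None], x[None, :])

X = np.arange(4).reshape(2, 2)

# A deliberately non-symmetric kernel, to make the orientation visible.
k_asym = lambda x1, x2: x1 - x2

K = kernel_matrix(k_asym, X)

# meshgrid with indexing="ij" gives the same orientation ...
X1, X2 = np.meshgrid(X, X, indexing="ij")
assert np.array_equal(K, k_asym(X1, X2))

# ... while the default indexing="xy" yields the transpose.
Y1, Y2 = np.meshgrid(X, X)
assert np.array_equal(K.T, k_asym(Y1, Y2))
```

If k cannot be written with elementwise operations, neither approach applies directly and the kernel would have to be evaluated pair by pair.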