I am using the following function:
kernel = @(X,Y,sigma) exp((-pdist2(X,Y,'euclidean').^2)./(2*sigma^2));
to compute a series of kernels, in the following way:
K = [(1:size(featureVectors,1))', kernel(featureVectors,featureVectors, sigma)];
However, since featureVectors is a huge matrix (something like 10000x10000), computing the kernel matrix K takes a really long time.
Is it possible to somehow speed up the computation?
EDIT: Context
I am using a classifier via libsvm, with a Gaussian kernel, as you may have noticed from the variable names and semantics.
I am currently working with roughly #terms ~= 10000 and #docs ~= 10000. This number of terms is what remained after stopword removal and stemming. This course indicates that having 10000 features makes sense.
Unfortunately, libsvm does not implement the Gaussian kernel automatically, so it has to be computed by hand. I took the idea from here, but the kernel computation (as suggested by the referenced question) is really slow.
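For reference, the index column prepended to K matches what libsvm's precomputed-kernel mode expects. A minimal sketch of how the matrix would then be passed to the libsvm MATLAB interface (this assumes the libsvm MATLAB bindings svmtrain/svmpredict are on the path, and labels is a column vector of class labels; both are assumptions, not part of the question):

```matlab
% Build the precomputed kernel: first column must be the sample serial number.
K = [(1:size(featureVectors,1))', kernel(featureVectors, featureVectors, sigma)];

% '-t 4' tells libsvm to treat the input as a precomputed kernel.
model = svmtrain(labels, K, '-t 4');
```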
You are using pdist2 with two equal input arguments (X and Y are equal when you call kernel). You could save half the time by computing each pair only once. You do that using pdist and then squareform:
kernel = @(X,sigma) exp((-squareform(pdist(X,'euclidean')).^2)./(2*sigma^2));
K = [(1:size(featureVectors,1))', kernel(featureVectors, sigma)];
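Beyond halving the work with pdist, another speedup worth trying is computing the squared Euclidean distances via matrix multiplication, which is often faster than pdist for large dense matrices because it hits optimized BLAS routines. A sketch (not benchmarked here; it reproduces squareform(pdist(X,'euclidean')).^2 up to floating-point round-off):

```matlab
% Squared distances via the identity ||x-y||^2 = ||x||^2 + ||y||^2 - 2*x'*y
sqnorms = sum(X.^2, 2);                            % n-by-1 squared row norms
D2 = bsxfun(@plus, sqnorms, sqnorms') - 2*(X*X');  % n-by-n squared distances
D2 = max(D2, 0);                                   % clamp tiny negatives from round-off
K = exp(-D2 ./ (2*sigma^2));                       % Gaussian kernel
```

Note that this forms the full n-by-n matrix in one go, so for 10000 samples it needs memory for an extra 10000x10000 double array on top of K itself.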