I can't understand how computing the kernel $K(x, z)$ takes only linear time even though it corresponds to an $O(n^d)$-dimensional feature space. Could someone please explain? I'm a newbie in ML.
In order to compute $K(x, z)$, you need:

- $O(n)$ multiplications $x_1 z_1, x_2 z_2, \ldots, x_n z_n$,
- $O(n)$ additions $x_1 z_1 + x_2 z_2 + \cdots + x_n z_n$,
- two $O(1)$ operations: adding $c$ and raising to the power $d$.

Thus, computing $K(x, z) = (\langle x, z \rangle + c)^d$ takes $O(n)$ time.
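As a minimal sketch of that computation in Python (the function name `poly_kernel` and the NumPy dependency are my own choices, not from the answer):

```python
import numpy as np

def poly_kernel(x, z, c=1.0, d=3):
    """Polynomial kernel K(x, z) = (<x, z> + c)**d.

    One O(n) dot product, then O(1) work: add c, raise to the power d.
    """
    return (np.dot(x, z) + c) ** d
```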
It's completely normal for the feature space to have a much higher dimension than the kernel's evaluation cost suggests: otherwise we wouldn't need kernels in the first place, because we could just compute the feature vectors directly.
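To make that contrast concrete, here is an illustrative sketch (again my own, with a hypothetical helper `phi`) for the degree-2 homogeneous kernel $K(x, z) = \langle x, z \rangle^2$, whose explicit feature map consists of all $n^2$ pairwise products. Both routes give the same number, but one costs $O(n^2)$ and the other $O(n)$:

```python
import numpy as np

def phi(x):
    # Explicit feature map for K(x, z) = (x . z)**2:
    # all n**2 pairwise products x_i * x_j, flattened into one vector.
    return np.outer(x, x).ravel()

rng = np.random.default_rng(0)
x, z = rng.standard_normal(5), rng.standard_normal(5)

explicit = np.dot(phi(x), phi(z))  # dot product in the n**2-dimensional space
shortcut = np.dot(x, z) ** 2       # same number, computed in O(n) time
assert np.isclose(explicit, shortcut)
```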
If you want a more extreme example, take a look at $K(x, y) = \min(x, y)$ on non-negative real numbers $x, y$. It takes constant time to evaluate $\min(x, y)$. However, the feature space is $L^2(\mathbb{R})$ (square-integrable functions on the real line, with the standard Hilbert-space scalar product), and the feature mapping is $\varphi(x) = \chi_{[0, x]}$, where $\chi_{[0, x]}$ denotes the characteristic function of the interval $[0, x]$. Thus, the feature space is infinite-dimensional, but the time it takes to evaluate the kernel is constant.
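For completeness, here is a one-line check (my own derivation, using only the definitions above) that this feature map really reproduces the kernel: the two characteristic functions overlap exactly on $[0, \min(x, y)]$, so

```latex
\langle \varphi(x), \varphi(y) \rangle
  = \int_{\mathbb{R}} \chi_{[0,x]}(t)\,\chi_{[0,y]}(t)\,dt
  = \int_{0}^{\min(x,y)} 1 \, dt
  = \min(x, y).
```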