Solved – Using a Gaussian kernel in SVM: how exactly is this then written as a dot product?

kernel-trick, machine-learning, svm

I am attempting to use SVMs for my class project, and I have selected the Gaussian kernel as, well, the kernel. That is,

$$
k(\mathbf{x}_1, \mathbf{x}_n) = e^{-\gamma \|\mathbf{x}_1 - \mathbf{x}_n\|^2}
$$

What I do not understand is how this kernel is then 'written as a dot product', or how we get around having to do that. My professor says that once training is finished, classifying a new vector amounts to computing a dot product between it and the support vectors (SVs). But given this kernel, how is that dot product actually being computed?

Best Answer

Look up "kernel trick". The idea is that, under certain conditions (Mercer's condition), a function $k(x,x')$ can be expressed as a dot product $<\phi(x),~\phi(x')>$, where $\phi$ is a function that transforms $x$ into a high dimensional (possibly infinite) representation.

The trick is that, as long as your optimization problem can be expressed solely in terms of dot products, you never need to know or compute $\phi$; you simply use the kernel function $k$ wherever a dot product would appear. In particular, the SVM decision function for a new point $\mathbf{x}$ is $f(\mathbf{x}) = \sum_i \alpha_i y_i\, k(\mathbf{x}_i, \mathbf{x}) + b$, where the sum runs over the support vectors $\mathbf{x}_i$. So 'taking a dot product between a new vector and the SVs' just means evaluating the kernel against each support vector.
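As a concrete illustration, here is a minimal NumPy sketch of evaluating that decision function with the Gaussian kernel. The support vectors, dual coefficients $\alpha_i y_i$, and bias below are made-up placeholders, not values from an actual training run:

```python
import numpy as np

def rbf_kernel(x1, x2, gamma=0.5):
    """Gaussian (RBF) kernel: k(x1, x2) = exp(-gamma * ||x1 - x2||^2)."""
    return np.exp(-gamma * np.sum((x1 - x2) ** 2))

def svm_decision(x_new, support_vectors, dual_coefs, bias, gamma=0.5):
    """Decision function f(x) = sum_i alpha_i * y_i * k(sv_i, x) + b.

    Each kernel evaluation stands in for the dot product
    <phi(sv_i), phi(x_new)> in feature space; phi itself is
    never constructed.
    """
    return sum(coef * rbf_kernel(sv, x_new, gamma)
               for sv, coef in zip(support_vectors, dual_coefs)) + bias

# Hypothetical "trained" model: support vectors, alpha_i * y_i, and bias.
support_vectors = np.array([[0.0, 1.0], [2.0, 0.5], [1.0, -1.0]])
dual_coefs = np.array([0.7, -0.4, 0.9])  # placeholder alpha_i * y_i values
bias = -0.1

x_new = np.array([1.5, 0.0])
print(svm_decision(x_new, support_vectors, dual_coefs, bias))
```

The sign of the returned value gives the predicted class; notice that nothing infinite-dimensional is ever computed.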

More details on Wikipedia: https://en.wikipedia.org/wiki/Kernel_method