I am attempting to use SVMs for my class project. For this project, I have selected the Gaussian (RBF) kernel. That is,
$$
k(\mathbf{x}_1, \mathbf{x}_n) = e^{-\gamma \|\mathbf{x}_1 - \mathbf{x}_n\|^2}
$$
What I do not understand is how this kernel can be 'written as a dot product'. My professor says that once training is finished, classifying a new point involves computing a dot product between the new vector and the support vectors. But given this kernel, how is that dot product actually being computed?
Best Answer
Look up "kernel trick". The idea is that, under certain conditions (Mercer's condition), a function $k(x, x')$ can be expressed as a dot product $\langle \phi(x), \phi(x') \rangle$, where $\phi$ is a function that maps $x$ into a high-dimensional (possibly infinite-dimensional) representation.
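To make this concrete, here is a small sketch (my own illustration, not from any textbook) for the 1-D RBF kernel. Expanding $e^{2\gamma x x'}$ as a Taylor series gives an explicit feature map $\phi_k(x) = e^{-\gamma x^2}\sqrt{(2\gamma)^k/k!}\,x^k$ for $k = 0, 1, 2, \dots$, so truncating the series should closely approximate the kernel value:

```python
import numpy as np
from math import factorial

gamma = 1.0
x, xp = 0.7, 1.2  # two arbitrary 1-D points

def phi(x, n_terms=30):
    # Truncated explicit feature map for the 1-D RBF kernel:
    # phi_k(x) = exp(-gamma*x^2) * sqrt((2*gamma)^k / k!) * x^k
    return np.array([
        np.exp(-gamma * x**2) * np.sqrt((2 * gamma)**k / factorial(k)) * x**k
        for k in range(n_terms)
    ])

kernel_value = np.exp(-gamma * (x - xp)**2)   # k(x, x') directly
dot_product  = phi(x) @ phi(xp)               # <phi(x), phi(x')>, truncated

print(kernel_value, dot_product)  # the two values agree to high precision
```

The full feature map is infinite-dimensional, which is exactly why you never compute $\phi$ in practice; the finite truncation here is only to show that the dot-product representation is real.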
The trick is that, as long as your optimization problem can be expressed solely with dot products, you do not need to know or compute $\phi$, you simply use the kernel function $k$.
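This is also how the prediction your professor describes works: the decision value for a new point is a weighted sum of kernel evaluations against the support vectors, $f(x) = \sum_i \alpha_i y_i\, k(\mathbf{s}_i, x) + b$. A quick sanity check (a sketch using scikit-learn, assuming it is installed) reproduces the library's decision function by hand:

```python
import numpy as np
from sklearn.svm import SVC

# Toy non-linearly-separable data: label by distance from the origin
rng = np.random.RandomState(0)
X = rng.randn(40, 2)
y = (X[:, 0]**2 + X[:, 1]**2 > 1).astype(int)

gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma).fit(X, y)

x_new = np.array([0.3, -0.8])

# Manual decision value: sum_i alpha_i*y_i * k(sv_i, x_new) + b,
# where dual_coef_ already holds the products alpha_i*y_i
k_vals = np.exp(-gamma * np.sum((clf.support_vectors_ - x_new)**2, axis=1))
manual = clf.dual_coef_[0] @ k_vals + clf.intercept_[0]

print(np.isclose(manual, clf.decision_function([x_new])[0]))  # True
```

Note that the only operations involving $x_{\text{new}}$ are kernel evaluations; $\phi$ never appears.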
More details on Wikipedia