Zen used method 1. Here is method 2: Map $x$ to a spherically symmetric Gaussian distribution centered at $x$ in the Hilbert space $L^2$. The standard deviation and a constant factor have to be tweaked for this to work exactly. For example, in one dimension,
$$ \int_{-\infty}^\infty \frac{\exp[-(x-z)^2/(2\sigma^2)]}{\sqrt{2 \pi} \sigma} \, \frac{\exp[-(y-z)^2/(2 \sigma^2)]}{\sqrt{2 \pi} \sigma} \, dz = \frac{\exp [-(x-y)^2/(4 \sigma^2)]}{2 \sqrt \pi \sigma}. $$
So, use a standard deviation of $\sigma/\sqrt 2$ and rescale the Gaussian to get $k(x,y) = \langle \Phi(x), \Phi(y)\rangle$. This last rescaling is needed because the $L^2$ norm of a normal density is not $1$ in general.
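The identity above is easy to check numerically. Here is a quick sanity check (a sketch assuming NumPy and SciPy are available; the particular values of $x$, $y$, and $\sigma$ are arbitrary):

```python
import numpy as np
from scipy.integrate import quad

sigma, x, y = 0.7, 0.3, -1.2  # arbitrary test values

def gaussian(z, mu, s):
    # Density of a normal distribution with mean mu and standard deviation s.
    return np.exp(-(z - mu) ** 2 / (2 * s ** 2)) / (np.sqrt(2 * np.pi) * s)

# Left-hand side: overlap integral of the two Gaussians centred at x and y.
lhs, _ = quad(lambda z: gaussian(z, x, sigma) * gaussian(z, y, sigma),
              -np.inf, np.inf)

# Right-hand side: the closed form from the identity above.
rhs = np.exp(-(x - y) ** 2 / (4 * sigma ** 2)) / (2 * np.sqrt(np.pi) * sigma)

assert abs(lhs - rhs) < 1e-10
```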
Let $\mathcal{X}$ be your input space, i.e., the space where your data points reside. Consider a function $\Phi:\mathcal{X} \rightarrow \mathcal{F}$ that takes a point from your input space $\mathcal{X}$ and maps it to a point in $\mathcal{F}$. Now, suppose we have mapped all of your data points from $\mathcal{X}$ into this new space $\mathcal{F}$. If you try to solve the usual linear SVM in $\mathcal{F}$ instead of $\mathcal{X}$, you will notice that all of the earlier derivation looks exactly the same, except that every point $x_i$ is represented as $\Phi(x_i)$, and instead of $x^Ty$ (the natural inner product for Euclidean space) we use $\langle \Phi(x), \Phi(y) \rangle$, the natural inner product in the new space $\mathcal{F}$. So, at the end, your $w^*$ looks like
$$
w^*=\sum_{i \in SV} h_i y_i \Phi(x_i)
$$
and hence,
$$
\langle w^*, \Phi(x) \rangle = \sum_{i \in SV} h_i y_i \langle \Phi(x_i), \Phi(x) \rangle
$$
Similarly,
$$
b^*=\frac{1}{|SV|}\sum_{i \in SV}\left(y_i - \sum_{j=1}^N\left(h_j y_j \langle \Phi(x_j), \Phi(x_i)\rangle\right)\right)
$$
and your classification rule looks like $c_x=\text{sign}(\langle w^*, \Phi(x) \rangle+b^*)$.
So far so good; there is nothing new, since we have simply applied the normal linear SVM in a different space. However, the magic part is this:
Let us say that there exists a function $k:\mathcal{X}\times\mathcal{X}\rightarrow \mathbb{R}$ such that $k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$. Then, we can replace all the dot products above with $k(x_i, x_j)$. Such a $k$ is called a kernel function.
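To make the definition concrete, here is a small worked example (a sketch; the specific vectors are arbitrary). For the homogeneous quadratic kernel $k(x,y) = (x^Ty)^2$ in two dimensions, an explicit feature map is $\Phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$, and we can verify that evaluating $k$ in input space agrees with the inner product in feature space:

```python
import numpy as np

def phi(v):
    # Explicit feature map for the quadratic kernel k(x, y) = (x . y)^2 in 2-D.
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

implicit = np.dot(x, y) ** 2          # kernel evaluated directly in input space
explicit = np.dot(phi(x), phi(y))     # inner product after mapping to feature space

assert np.isclose(implicit, explicit)
```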
Therefore, your $w^*$ and $b^*$ look like,
$$
\langle w^*, \Phi(x) \rangle = \sum_{i \in SV} h_i y_i k(x_i, x)
$$
$$
b^*=\frac{1}{|SV|}\sum_{i \in SV}\left(y_i - \sum_{j=1}^N\left(h_j y_j k(x_j, x_i)\right)\right)
$$
For which kernel functions is the above substitution valid? That is a slightly involved question: the kernel must be symmetric and positive semi-definite (Mercer's condition), and you may want to take up proper reading material to understand the implications. However, I will just add that the above holds true for the RBF kernel.
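One consequence of positive semi-definiteness is easy to check empirically: the Gram matrix $K_{ij} = k(x_i, x_j)$ built from any data set must have no negative eigenvalues. A quick check for the RBF kernel (a sketch assuming NumPy; the random data and $\gamma$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(25, 3))   # arbitrary random data set
gamma = 0.8

# Pairwise squared distances, then the RBF Gram matrix K_ij = exp(-gamma ||x_i - x_j||^2).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-gamma * sq)

# Mercer's condition implies K is symmetric positive semi-definite for any data set.
eigvals = np.linalg.eigvalsh(K)
assert eigvals.min() > -1e-10
```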
To answer your question, "Is the situation so that all the support vectors are needed for the classification?"
Yes. As you may notice above, we compute the inner product of $w^*$ with $\Phi(x)$ via kernel evaluations instead of computing $w^*$ explicitly. This requires us to retain all the support vectors at classification time.
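This is exactly what fitted SVM implementations store. As an illustration (a sketch assuming scikit-learn; the toy data set is arbitrary), we can reproduce `SVC`'s decision value by hand from its stored support vectors and dual coefficients, using only kernel evaluations:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)   # a non-linearly-separable toy problem

clf = SVC(kernel="rbf", gamma=1.0, C=10.0).fit(X, y)

z = np.array([[0.5, -0.3]])                  # a test point
# clf.dual_coef_ stores h_i * y_i for the support vectors, so the
# decision value is sum_i h_i y_i k(x_i, z) + b, using kernels only.
K = rbf_kernel(clf.support_vectors_, z, gamma=1.0)
by_hand = (clf.dual_coef_ @ K).item() + clf.intercept_.item()

assert np.isclose(by_hand, clf.decision_function(z).item())
```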
Note: the $h_i$'s in the final section here are the solution to the dual of the SVM in the space $\mathcal{F}$, not $\mathcal{X}$. Does that mean we need to know the function $\Phi$ explicitly? Luckily, no. If you look at the dual objective, it involves the data only through inner products, and since $k$ lets us compute those inner products directly, we don't need to know $\Phi$ explicitly. The dual objective simply looks like
$$
\max_h \sum_i h_i - \frac{1}{2}\sum_{i,j} y_i y_j h_i h_j k(x_i, x_j) \\
\text{subject to: } \sum_i y_i h_i = 0,\quad h_i \geq 0
$$
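Since the dual above is just a quadratic program in the $h_i$'s, it can be solved with a generic constrained optimiser. Here is a minimal sketch (assuming SciPy; the tiny separable data set and the linear kernel are assumptions for illustration, since only a linear kernel lets us form $w^*$ explicitly to check the result):

```python
import numpy as np
from scipy.optimize import minimize

# Tiny linearly separable data set (assumed for illustration); linear kernel k(x, y) = x . y.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-4.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
K = X @ X.T                                   # Gram matrix of kernel evaluations

def neg_dual(h):
    # Negate the dual objective so that a minimiser performs the maximisation.
    return -(h.sum() - 0.5 * (h * y) @ K @ (h * y))

res = minimize(neg_dual, np.zeros(len(y)), method="SLSQP",
               constraints=[{"type": "eq", "fun": lambda h: h @ y}],
               bounds=[(0.0, None)] * len(y))
h = res.x
sv = h > 1e-6                                 # the support vectors have h_i > 0
w = (h * y) @ X                               # w* = sum_i h_i y_i x_i (linear kernel only)
b = np.mean(y[sv] - X[sv] @ w)

# At the optimum every point satisfies y_i (w . x_i + b) >= 1, with equality on the SVs.
margins = y * (X @ w + b)
assert np.isclose(margins.min(), 1.0, atol=1e-3)
```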
Best Answer
You are missing one thing, namely the fact that we do not need to know the images of data instances in feature space $\phi(\mathbf{x}_i)$. For some kernel functions, the feature space is very complex/unknown (for instance some graph kernels), or infinite dimensional (for example the RBF kernel).
Kernel methods only need to be able to compute inner products between two images in feature space, e.g. $\kappa(\mathbf{x}_i,\mathbf{x}_j)=\langle\phi(\mathbf{x}_i),\phi(\mathbf{x}_j)\rangle$. We don't have to know the feature space to be able to compute inner products in it. This is called the kernel trick.
For an SVM, specifically, $\mathbf{w}$ is the separating hyperplane in feature space. You cannot always write this down in input space. Again, for the RBF kernel $\mathbf{w}$ resides in an infinite dimensional feature space. All we need to be able to do is compute the inner product of $\mathbf{w}$ and the image of the test instance $\mathbf{z}$ in feature space $\phi(\mathbf{z})$, which is:
$$\langle\mathbf{w},\phi(\mathbf{z})\rangle = \sum_{i\in SV}\alpha_i y_i \kappa(\mathbf{x}_i,\mathbf{z}).$$
SVMs exploit the so-called representer theorem, which states that the resulting models can always be expressed as a weighted sum of kernel evaluations between some training instances (the support vectors) and the test instance. This is in fact exploited by all kernel methods.
The RBF kernel maps onto an infinite dimensional feature space. For a writeup on this you may consult these slides by Chih-Jen Lin, particularly slides 10 and 11. For a one-dimensional $x$:
$$\phi_{RBF}(x) = e^{-\gamma x^2}\big[1,\sqrt{\frac{2\gamma}{1!}}x, \sqrt{\frac{(2\gamma)^2}{2!}}x^2, \sqrt{\frac{(2\gamma)^3}{3!}}x^3,\ldots\big]^T,$$
which follows from the Taylor expansion of the exponential function.
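We can confirm numerically that a truncated version of this infinite-dimensional map recovers the RBF kernel (a sketch assuming NumPy; the inputs, $\gamma$, and the truncation length are arbitrary):

```python
import numpy as np
from math import factorial

gamma = 0.5
x, z = 0.8, -0.4   # two one-dimensional inputs (arbitrary test values)

def phi_rbf(v, terms=30):
    # First `terms` coordinates of the explicit RBF feature map:
    # phi_n(v) = exp(-gamma v^2) * sqrt((2 gamma)^n / n!) * v^n.
    n = np.arange(terms)
    coeffs = np.sqrt((2 * gamma) ** n / np.array([factorial(k) for k in n], dtype=float))
    return np.exp(-gamma * v ** 2) * coeffs * v ** n

# The truncated inner product converges rapidly to exp(-gamma (x - z)^2).
approx = np.dot(phi_rbf(x), phi_rbf(z))
exact = np.exp(-gamma * (x - z) ** 2)

assert abs(approx - exact) < 1e-12
```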