Solved – Support vector machines and regression

machine-learning, regression, svm

There's already been an excellent discussion on how support vector machines handle classification, but I'm very confused about how support vector machines generalize to regression.

Anyone care to enlighten me?

Best Answer

Basically they generalize in the same way. The kernel-based approach to regression is to transform the feature vector, call it $\mathbf{x}$, into some vector space, then perform a linear regression in that vector space. To avoid the 'curse of dimensionality', the linear regression in the transformed space is somewhat different from ordinary least squares. The upshot is that the regression in the transformed space can be expressed as $\ell(\mathbf{x}) = \sum_i w_i \, \phi(\mathbf{x}_i) \cdot \phi(\mathbf{x})$, where the $\mathbf{x}_i$ are observations from the training set, $\phi(\cdot)$ is the transform applied to the data, and the dot is the dot product. Thus the linear regression is 'supported' by a few (preferably a very small number of) training vectors.
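
To make that concrete, here's a minimal sketch (assuming scikit-learn's `SVR` with an RBF kernel; the data and settings are just illustrative, not part of the answer above) that reproduces a fitted model's prediction as exactly that weighted sum of kernel evaluations $\phi(\mathbf{x}_i) \cdot \phi(\mathbf{x})$ over the support vectors:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics.pairwise import rbf_kernel

# Toy 1-D regression problem (illustrative data).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(100)

model = SVR(kernel="rbf", gamma=0.5, C=1.0, epsilon=0.1).fit(X, y)

# Manual prediction: sum_i w_i * k(x_i, x) + b, over the support vectors only.
x_new = np.array([[0.7]])
k = rbf_kernel(model.support_vectors_, x_new, gamma=0.5)  # phi(x_i) . phi(x)
manual = model.dual_coef_ @ k + model.intercept_

print(manual.ravel(), model.predict(x_new))  # the two agree
```

Note that only the support vectors enter the sum; the other training points have zero weight, which is the sparsity the answer alludes to.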

All the mathematical details are hidden in the peculiar regression done in the transformed space (the 'epsilon-insensitive tube' and so on) and in the choice of the transform, $\phi$. For a practitioner, there is also the question of choosing a few free parameters (usually in the definitions of $\phi$ and the regression), as well as featurization, which is where domain knowledge is usually helpful.
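
As a concrete illustration of those free parameters, here's a hedged sketch (parameter names follow scikit-learn's `SVR`; the specific grid is an assumption for illustration only) of tuning the regularization strength, the tube width, and the kernel width by cross-validation:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Same kind of toy data as before (illustrative).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(200)

param_grid = {
    "C": [0.1, 1.0, 10.0],        # trade-off between flatness and fitting errors
    "epsilon": [0.01, 0.1, 0.5],  # half-width of the epsilon-insensitive tube
    "gamma": [0.1, 0.5, 1.0],     # RBF kernel width, i.e. the choice of phi
}
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=5).fit(X, y)
print(search.best_params_)
```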