Number of features in SVM Kernel matrix

kernel-trick, machine-learning, python, svm

I'm trying to implement the SVM algorithm manually, and I've succeeded in doing so with the case of a linear kernel (no kernel function).

Now I'd like to add a Gaussian kernel to the algorithm, but I'm facing a size-mismatch problem. The way I understand it is that by applying the kernel to your original examples you create a new feature matrix. If I have $m$ training examples, then the new kernel matrix would be an $(m \times m)$ matrix regardless of how many features I had, and the weight vector would be of length $(m+1)$.
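
Concretely, this is roughly how I build the training kernel matrix (the `gamma` value here is just a placeholder):

```python
import numpy as np

def gaussian_kernel_matrix(X, Z, gamma=0.5):
    """Gaussian (RBF) kernel: K[i, j] = exp(-gamma * ||X[i] - Z[j]||^2)."""
    # Pairwise squared Euclidean distances between rows of X and rows of Z
    sq_dists = (np.sum(X**2, axis=1)[:, None]
                + np.sum(Z**2, axis=1)[None, :]
                - 2 * X @ Z.T)
    return np.exp(-gamma * sq_dists)

# X_train has shape (m, n); the training kernel matrix is (m, m)
# K_train = gaussian_kernel_matrix(X_train, X_train)
```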

Then when I apply a test set to the algorithm, with $k$ examples each having $n$ features, I get a dimension-mismatch error: $n \neq m+1$.

What am I missing?

Best Answer

The kernel matrix is $m \times m$ if you have $m$ training examples because it is the matrix of dot products in the transformed feature space. The weight vector, however, has nothing to do with $m$: its dimension is the original feature dimension ($+1$ if you embed the bias term in the weights), and computing it explicitly only makes sense when the kernel is linear. For a nonlinear kernel you don't compute the weights at all, since you may not know the explicit transformation. Instead, the kernel is used directly in the decision function; for binary classification, $f(x) = \operatorname{sign}\left(\sum_{i=1}^{m} \alpha_i y_i K(x_i, x) + b\right)$. So at test time you evaluate the kernel between each test point and the training points, giving a $k \times m$ test kernel matrix, rather than multiplying the test features by a weight vector.
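
A minimal sketch of that prediction step (the names `alpha`, `b`, and `gamma` are placeholders for whatever your training procedure produced; they are not from the question):

```python
import numpy as np

def gaussian_kernel_matrix(X, Z, gamma=0.5):
    """Gaussian (RBF) kernel: K[i, j] = exp(-gamma * ||X[i] - Z[j]||^2)."""
    sq_dists = (np.sum(X**2, axis=1)[:, None]
                + np.sum(Z**2, axis=1)[None, :]
                - 2 * X @ Z.T)
    return np.exp(-gamma * sq_dists)

def predict(X_test, X_train, y_train, alpha, b, gamma=0.5):
    """Kernel SVM decision: sign(sum_i alpha_i * y_i * K(x_i, x) + b).

    X_test: (k, n), X_train: (m, n), y_train in {-1, +1} of shape (m,),
    alpha: (m,) dual coefficients. No explicit weight vector is formed.
    """
    K = gaussian_kernel_matrix(X_test, X_train, gamma)  # shape (k, m), not (k, n)
    scores = K @ (alpha * y_train) + b                  # shape (k,)
    return np.sign(scores)
```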
