Solved – How to select landmarks of Kernel to run SVM

kernel tricksvm

Objective

Clarify how to choose the kernel reference points (landmarks) to identify the non-linear boundary.

enter image description here

Background

Going through SVM at Coursera ML – Support Vector Machine and trying to understand how to choose the landmarks to measure the distances to feed into the Gaussian Kernel.

It says "put the landmarks in the exact same locations as all the training examples".

enter image description here

enter image description here

Question

Not clear why "the exact same locations of all the training data".

  1. Why using all the data?

    Number of features M and the number of data N is different and I suppose M << N. Then should we choose M number of data to use landmarks?

  2. Why not considering if a data to use as a landmark is classified positive or negative?

    I believe we would like to distinguish positive data (higher Gaussian probability), then why use the negative data as well as the landmarks?

    In the YouTube SVM with polynomial kernel visualization example (although it does not use Gaussian), the landmarks should be those that represent red points?

Best Answer

M is the number of data points, not the number of features. So we take all our (training) data, and for each (xi,yi), we get a landmark.

Notice that in the combined minimisation term, each fi is combined with its matching yi, so the minimisation takes account of which landmarks should be positive and which should be negative.

In the video each red dot AND each blue dot should be a landmark.