Solved – Support Vectors Not Falling on Margin Lines for e1071 and kernlab packages in R

Tags: e1071, libsvm, machine learning, r, svm

In a previous thread, "Computing the decision boundary of a linear SVM model", the following R code was given as a way to compute the formula of the hyperplane and its margins, given an input set of hyperparameters:

library(kernlab)  # provides ksvm(), alpha(), alphaindex(), b()
x <- rbind(matrix(rnorm(120),,2),matrix(rnorm(120,mean=3),,2))
y <- matrix(c(rep(1,60),rep(-1,60)))
svp <- ksvm(x,y,type="C-svc")
alpha(svp)  # support vectors whose indices may be found with alphaindex(svp)
b(svp)      # (negative) intercept 
plot(scale(x), col=y+2, pch=y+2, xlab="", ylab="")
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])
b <- b(svp)

However, in the resulting plot, none of the support vectors actually fall on either margin line. If

abline(b/w[1],-w[2]/w[1]) #is supposed to be the maximum-marginal hyperplane and
abline((b+1)/w[1],-w[2]/w[1],lty=2) #is supposed to be the yi = +1 margin and
abline((b-1)/w[1],-w[2]/w[1],lty=2) #is supposed to be the yi = -1 margin

then why don't the support vectors, alpha(svp), fall on either the +1 or the -1 margin?

Now, before you say it's because kernelf = "rbfkernel" by default and our hyperplane and margins have not been transformed according to the rbfkernel: I can explicitly call the linear kernel with kernelf = "vanilladot" (i.e., svp <- ksvm(x,y,type="C-svc",kernelf="vanilladot")), and the hyperplane and margins are even farther from the support vectors defined by alpha(svp) than when kernelf="rbfkernel".

Note that I find similar results when I use the e1071 package (which is perhaps not surprising, as they are both based on the LIBSVM library).

If I am doing something wrong, or missing a crucial point (like perhaps ksvm() uses a soft margin by default and is hiding the slack variables from me), please let me know!

Here is my code for the linear kernel with the kernlab package:

x <- rbind(matrix(rnorm(120),,2),matrix(rnorm(120,mean=3),,2))
y <- matrix(c(rep(1,60),rep(-1,60)))
svp <- ksvm(x,y,type="C-svc",kernelf="vanilladot")
plot(x, col=y+2, pch=y+2, xlab="", ylab="")
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])
b <- b(svp)

Best Answer

I am not an R user, but I suspect it is because you are using the soft-margin support vector machine (which is what I presume "C-svc" means). The support vectors are only all guaranteed to lie exactly on the margins for the hard-margin SVM (where C is infinite). Essentially, the C parameter penalises the degree to which the support vectors are allowed to violate the margin constraint, so if C is finite, some support vectors can end up inside the margin (or even on the wrong side of the boundary) in the interests of making the margin broader, which often leads to better generalisation.
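
For reference, the soft-margin conditions behind this argument can be written out as follows. This is the standard textbook formulation, not something taken from the kernlab or e1071 source, and it assumes the sign convention $f(x) = w^\top x - b$ suggested by the "(negative) intercept" comment in the question. The symbols $\xi_i$ (slack variables) and $\alpha_i$ (dual coefficients) are the usual ones.

```latex
\begin{aligned}
&\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w\rVert^2 + C\sum_{i}\xi_i
\quad\text{subject to}\quad
y_i\,(w^\top x_i - b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\\[4pt]
&\alpha_i = 0 \;\Rightarrow\; y_i f(x_i) \ge 1
  &&\text{(not a support vector)}\\
&0 < \alpha_i < C \;\Rightarrow\; y_i f(x_i) = 1
  &&\text{(support vector exactly on a margin)}\\
&\alpha_i = C \;\Rightarrow\; y_i f(x_i) \le 1
  &&\text{(support vector on or inside the margin, possibly misclassified)}
\end{aligned}
```

So with a finite C, only the support vectors whose coefficient is strictly below C are expected to sit on the dashed margin lines; the rest can lie anywhere on the wrong side of their margin.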

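One way to check this explanation empirically is sketched below. It is a minimal example rather than a definitive diagnosis, and it rests on a few assumptions that are not in the original post: that the kernel is selected through `ksvm()`'s `kernel` argument, that `scaled = FALSE` keeps the fit in the same coordinates as the plot, and that the lines are drawn by solving `w[1]*x1 + w[2]*x2 - b = k` for the second coordinate. The seed and the names `b0`, `f_sv` and `free` are arbitrary choices for illustration.

```r
library(kernlab)  # provides ksvm(), alpha(), alphaindex(), coef(), b()

set.seed(1)  # arbitrary seed so the sketch is repeatable
x <- rbind(matrix(rnorm(120), , 2), matrix(rnorm(120, mean = 3), , 2))
y <- c(rep(1, 60), rep(-1, 60))

C <- 1  # soft-margin cost (passed explicitly so it can be varied)
# scaled = FALSE: fit on the raw coordinates that are plotted below
svp <- ksvm(x, y, type = "C-svc", kernel = "vanilladot", C = C, scaled = FALSE)

sv <- alphaindex(svp)[[1]]   # row indices of the support vectors
a  <- alpha(svp)[[1]]        # their dual coefficients alpha_i
w  <- colSums(coef(svp)[[1]] * x[sv, , drop = FALSE])  # sum_i alpha_i y_i x_i
b0 <- b(svp)                 # negative intercept: f(x) = w.x - b0

# Decision values at the support vectors
f_sv <- as.vector(x[sv, , drop = FALSE] %*% w - b0)
free <- a < C - 1e-6         # alpha_i strictly below the bound C

# Free support vectors should have |f| close to 1 (on a margin);
# bounded ones (alpha_i = C) may lie inside the margin or beyond it.
print(round(cbind(alpha = a, f = f_sv, free = free), 3))

plot(x, col = y + 2, pch = y + 2, xlab = "", ylab = "")
points(x[sv[free], , drop = FALSE], cex = 2)  # circle the free support vectors
# Intercept and slope from solving w[1]*x1 + w[2]*x2 - b0 = k for x2
abline(b0 / w[2], -w[1] / w[2])                 # k = 0: decision boundary
abline((b0 + 1) / w[2], -w[1] / w[2], lty = 2)  # k = +1 margin
abline((b0 - 1) / w[2], -w[1] / w[2], lty = 2)  # k = -1 margin
```

Increasing `C` (for example to 1000) should leave fewer support vectors at the bound, so more of them end up on the dashed lines; on data that are not linearly separable, some will always remain inside the margin.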
