Solved – Support Vectors Not Falling on Margin Lines for the e1071 and kernlab Packages in R

Tags: e1071, libsvm, machine learning, r, svm

In a previous thread, Computing the decision boundary of a linear SVM model, the following R code was given as a way to compute the equation of the separating hyperplane and its margins from a fitted model's parameters:

library(kernlab)    
set.seed(101)
x <- rbind(matrix(rnorm(120),,2),matrix(rnorm(120,mean=3),,2))
y <- matrix(c(rep(1,60),rep(-1,60)))
svp <- ksvm(x,y,type="C-svc")
plot(svp,data=x)
alpha(svp)  # support vector coefficients; the indices of the support vectors are given by alphaindex(svp)
b(svp)      # (negative) intercept
plot(scale(x), col=y+2, pch=y+2, xlab="", ylab="")  # scale(x) because ksvm() scales the data by default
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])  # w as a weighted sum of the support vectors
b <- b(svp)
abline(b/w[1],-w[2]/w[1])
abline((b+1)/w[1],-w[2]/w[1],lty=2)
abline((b-1)/w[1],-w[2]/w[1],lty=2)

However, in the resulting plot, none of the support vectors actually fall on either margin line.

If

abline(b/w[1],-w[2]/w[1]) #is supposed to be the maximum-marginal hyperplane and
abline((b+1)/w[1],-w[2]/w[1],lty=2) #is supposed to be the yi = +1 margin and
abline((b-1)/w[1],-w[2]/w[1],lty=2) #is supposed to be the yi = -1 margin

then why don't the support vectors (the points indexed by alphaindex(svp)) fall on either the +1 or the -1 margin?
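One way to check this numerically rather than by eye (a small sketch, reusing svp from the code above) is to evaluate the model's decision function at the support vectors; a point lying exactly on a margin would give a decision value of exactly +1 or -1:

sv <- x[unlist(alphaindex(svp)),]        # the support vectors themselves
f <- predict(svp, sv, type="decision")   # decision values at the support vectors
print(round(f, 3))                       # on-margin points give +/-1; margin violators fall strictly inside (-1, 1)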

Now, before you say it's because the kernel defaults to the Gaussian RBF (kernel = "rbfdot") and our hyperplane and margins have not been transformed accordingly: I can explicitly request the linear kernel with kernel = "vanilladot" (i.e. svp <- ksvm(x,y,type="C-svc",kernel="vanilladot")), and the hyperplane and margins land even farther from the support vectors than with the RBF kernel.

Note that I get similar results with the e1071 package (perhaps not surprising, as both packages are based on the LIBSVM library).

If I am doing something wrong or missing a crucial point (perhaps ksvm() uses a soft margin by default and is hiding the slack variables from me), please let me know!

Here is my code for the linear kernel with the kernlab package:

library(kernlab)
set.seed(101)
x <- rbind(matrix(rnorm(120),,2),matrix(rnorm(120,mean=3),,2))
y <- matrix(c(rep(1,60),rep(-1,60)))
svp <- ksvm(x,y,type="C-svc",kernel="vanilladot")  # linear kernel (the ksvm argument is 'kernel')
plot(svp,data=x)
plot(x, col=y+2, pch=y+2, xlab="", ylab="")
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])
b <- b(svp)
abline(b/w[1],-w[2]/w[1])
abline((b+1)/w[1],-w[2]/w[1],lty=2)
abline((b-1)/w[1],-w[2]/w[1],lty=2)
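As an aside on coordinates (not the explanation accepted below, just a pitfall worth flagging): ksvm() standardizes its inputs by default (scaled = TRUE), so w and b live in the scaled coordinate system, which is why the first snippet plots scale(x). A minimal sketch that sidesteps the issue by turning scaling off, so the lines can be drawn against the raw x. Note that solving w1*x1 + w2*x2 = b for x2 gives intercept b/w[2] and slope -w[1]/w[2]; in this near-symmetric example w[1] is approximately equal to w[2], which is why the quoted form using w[1] looks almost the same:

svp <- ksvm(x, y, type="C-svc", kernel="vanilladot", scaled=FALSE)  # no internal rescaling
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])
b <- b(svp)
plot(x, col=y+2, pch=y+2, xlab="", ylab="")
abline(b/w[2], -w[1]/w[2])                # decision boundary: w.x = b
abline((b+1)/w[2], -w[1]/w[2], lty=2)     # yi = +1 margin
abline((b-1)/w[2], -w[1]/w[2], lty=2)     # yi = -1 margin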

Best Answer

I am not an R user, but I suspect it is because you are using the soft-margin support vector machine (which is what I presume "C-svc" means). The support vectors will only all lie exactly on the margins for the hard-margin SVM (where C is infinite). Essentially, the C parameter sets the penalty for violating the margin constraint: if C is finite, support vectors are allowed to fall inside the margin (or even onto the wrong side of the boundary) in the interest of making the margin broader, which often leads to better generalisation. (Only the bounded support vectors, those with alpha_i = C, violate the margin; any support vector with 0 < alpha_i < C still sits exactly on it.)
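For what it's worth, this claim is easy to check in R (a minimal sketch; the well-separated toy data and the particular C value are my own stand-ins, since a hard margin only exists for separable data):

library(kernlab)
set.seed(101)
x2 <- rbind(matrix(rnorm(120),,2), matrix(rnorm(120, mean=5),,2))  # mean=5 keeps the classes cleanly separable
y2 <- matrix(c(rep(1,60), rep(-1,60)))

# A very large C approximates the hard-margin SVM (C -> infinity): margin
# violations become so expensive that every support vector is pushed onto
# a margin, where the decision value is exactly +/-1.
svp_hard <- ksvm(x2, y2, type="C-svc", kernel="vanilladot", C=1e6, scaled=FALSE)
f <- predict(svp_hard, x2[unlist(alphaindex(svp_hard)),], type="decision")
print(round(f, 3))   # approximately +/-1, unlike the original fit with overlapping classes and the default C=1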
