Solved – SVM options in scikit-learn

Tags: machine learning, python, scikit-learn, svm

Just curious about two options in the scikits SVM class.

Does anyone know what scale_C and shrinking do? Not much in the documentation unfortunately.

Best Answer

I realize this is a super old question, but I ran into this same thing today and found this document. Section 7.3, which describes shrinking as implemented in LIBSVM (which sklearn's SVM wraps), begins with the following useful blurb:

The shrinking technique reduces the size of the problem by temporarily eliminating variables α_i that are unlikely to be selected in the SMO working set because they have reached their lower or upper bound (Joachims, 1999). The SMO iterations then continues on the remaining variables. Shrinking reduces the number of kernel values needed to update the gradient vector (see algorithm 6.2, line 8). The hit rate of the kernel cache is therefore improved.

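So basically LIBSVM temporarily optimizes over a subset of the Lagrange multipliers α_i, setting aside those that already sit at a bound (0 or C) and look unlikely to move. Since many multipliers are typically zero at the solution, this is usually a safe heuristic, and the shrinking option simply toggles it: it affects training time, not the problem being solved. Much more information can be found in the linked document.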
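If you want to see the effect for yourself, here is a minimal sketch that fits the same model with shrinking turned on and off and compares the results. The synthetic dataset, its size, the RBF kernel, C=1.0 and the 1e-8 tolerance are my own arbitrary choices for illustration, not anything from the question or the quoted document, and timings will vary with data and hardware.

```python
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic binary problem; sizes and seed are arbitrary illustration choices.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
C = 1.0

results = {}
for shrinking in (True, False):
    clf = SVC(kernel="rbf", C=C, shrinking=shrinking)
    start = time.perf_counter()
    clf.fit(X, y)
    elapsed = time.perf_counter() - start

    # Non-support vectors have alpha_i = 0 (lower bound).
    n_zero = len(X) - clf.support_.size
    # dual_coef_ holds y_i * alpha_i, so |value| close to C means upper bound.
    n_at_C = int((np.abs(clf.dual_coef_) >= C - 1e-8).sum())

    results[shrinking] = clf
    print(f"shrinking={shrinking}: fit in {elapsed:.4f}s, "
          f"{n_zero} multipliers at 0, {n_at_C} at C")

# Both runs solve the same problem, so predictions should agree
# (up to the solver's stopping tolerance).
agreement = (results[True].predict(X) == results[False].predict(X)).mean()
print(f"prediction agreement: {agreement:.3f}")
```

On a small problem like this the timing difference may be negligible or even reversed; the heuristic tends to matter more on larger datasets, where a sizeable fraction of the multipliers end up at a bound.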