Solved – Comparing SVM and logistic regression

logisticoptimizationregressionsvm

Can someone please give me some intuition as to when to choose either SVM or LR? I want to understand the intuition behind what is the difference between the optimization criteria of learning the hyperplane of the two, where the respective aims are as follows:

SVM: Try to maximize the margin between the closest support vectors
LR: Maximize the posterior class probability

Let's consider the linear feature space for both SVM and LR.

Some differences I know of already:

SVM is deterministic (but we can use Platts model for probability score) while LR is probabilistic.
For the kernel space, SVM is faster (stores just support vectors)

Best Answer

Linear SVMs and logistic regression generally perform comparably in practice. Use SVM with a nonlinear kernel if you have reason to believe your data won't be linearly separable (or you need to be more robust to outliers than LR will normally tolerate). Otherwise, just try logistic regression first and see how you do with that simpler model. If logistic regression fails you, try an SVM with a non-linear kernel like a RBF.

EDIT:

Ok, let's talk about where the objective functions come from.

The logistic regression comes from generalized linear regression. A good discussion of the logistic regression objective function in this context can be found here: https://stats.stackexchange.com/a/29326/8451

The Support Vector Machines algorithm is much more geometrically motivated. Instead of assuming a probabilistic model, we're trying to find a particular optimal separating hyperplane, where we define "optimality" in the context of the support vectors. We don't have anything resembling the statistical model we use in logistic regression here, even though the linear case will give us similar results: really this just means that logistic regression does a pretty good job of producing "wide margin" classifiers, since that's all SVM is trying to do (specifically, SVM is trying to "maximize" the margin between the classes).

I'll try to come back to this later and get a bit deeper into the weeds, I'm just sort of in the middle of something :p

Best Answer

Related Solutions

Solved – Non-linear SVM classification with RBF kernel

Solved – Kernel logistic regression vs SVM

Related Question