Neural Networks and SVM – Is a Single-Layer Neural Network with ReLU Activation Equivalent to an SVM?

neural-networks, svm

Suppose I have a simple single-layer neural network with $n$ inputs and a single output (a binary classification task). If I set the activation function of the output node to a sigmoid, then the result is a logistic regression classifier.

In the same scenario, if I change the output activation to a ReLU (rectified linear unit), is the resulting structure the same as, or similar to, an SVM?

If not, why not?

Best Answer

Perhaps what makes you think of ReLU is the hinge loss $E = \max(1 - ty, 0)$ of SVMs: it has exactly the functional form of a ReLU, but applied to the margin $1 - ty$ inside the loss. The hinge loss does not restrict the output activation function to be non-negative, so a ReLU on the output node does not give you an SVM.
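To make this concrete, here is a minimal NumPy sketch (the function names and sample values are my own) showing that the hinge loss is just the ReLU function applied to the margin rather than to the network output:

```python
import numpy as np

def relu(z):
    """Rectified linear unit: max(z, 0), applied element-wise."""
    return np.maximum(z, 0.0)

def hinge_loss(y, t):
    """SVM hinge loss E = max(1 - t*y, 0): a ReLU applied to the margin 1 - t*y."""
    return relu(1.0 - t * y)

y = np.array([2.0, 0.3, -1.5])   # raw (pre-activation) network outputs
t = np.array([1.0, 1.0, -1.0])   # targets in {-1, +1}
print(hinge_loss(y, t))          # [0.  0.7 0. ] -- only the margin violator is penalized
```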

For the network's loss to have the same form as an SVM's, we can simply remove any non-linear activation function from the output layer and use the hinge loss for backpropagation; adding the usual L2 penalty on the weights then gives exactly the primal soft-margin SVM objective.
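As a sketch of that recipe (plain subgradient descent with arbitrarily chosen hyperparameters, on hypothetical toy data), training a linear output node on the regularized hinge loss amounts to fitting a linear SVM:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data with targets t in {-1, +1}
X = rng.normal(size=(200, 2))
t = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)

w, b = np.zeros(2), 0.0
lr, lam, n = 0.1, 0.01, len(t)   # step size and L2 strength are arbitrary choices

for epoch in range(200):
    y = X @ w + b                # linear output node: no activation function
    v = (1.0 - t * y) > 0        # examples that currently violate the margin
    # Subgradient of (lam/2)*||w||^2 + (1/n) * sum_i max(1 - t_i*y_i, 0)
    grad_w = lam * w - (t[v, None] * X[v]).sum(axis=0) / n
    grad_b = -t[v].sum() / n
    w -= lr * grad_w
    b -= lr * grad_b

print("training accuracy:", np.mean(np.sign(X @ w + b) == t))
```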

Moreover, if we replace the hinge loss with $E = \ln(1 + \exp(-ty))$ (which looks like a smoothed version of the hinge loss), then we are doing logistic regression, just as in a typical sigmoid + cross-entropy network. It can be thought of as moving the sigmoid function from the output layer into the loss.
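A quick numerical check (a self-contained sketch; the function names are mine) confirms that this loss on the raw output is identical to putting a sigmoid on the output node and using cross-entropy:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(y, t):
    """E = ln(1 + exp(-t*y)) on the raw output, with targets t in {-1, +1}."""
    return np.log1p(np.exp(-t * y))

def sigmoid_cross_entropy(y, t):
    """Sigmoid output activation followed by binary cross-entropy."""
    p = sigmoid(y)
    t01 = (t + 1.0) / 2.0        # map targets {-1, +1} -> {0, 1}
    return -(t01 * np.log(p) + (1.0 - t01) * np.log(1.0 - p))

y = np.array([-2.0, 0.5, 3.0])
t = np.array([1.0, -1.0, 1.0])
print(np.allclose(logistic_loss(y, t), sigmoid_cross_entropy(y, t)))  # True
```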

So in terms of their loss functions, SVMs and logistic regression are very close; SVMs, however, use a quite different procedure for training and inference, in which the solution is expressed through a sparse set of support vectors (typically found by solving a quadratic program).

There's a nice discussion of the relationship between SVMs and logistic regression in section 7.1.2 of the book Pattern Recognition and Machine Learning.

