Solved – Quiz: Tell the classifier by its decision boundary

classification, machine learning, neural networks, self-study, svm

We are given the 6 decision boundaries below. The decision boundaries are the violet lines; the dots and crosses are two different data sets (classes). We have to decide which one is a:

  • Linear SVM
  • Kernelized SVM (Polynomial kernel of order 2)
  • Perceptron
  • Logistic Regression
  • Neural Network (1 hidden layer with 10 rectified linear units)
  • Neural Network (1 hidden layer with 10 tanh units)

I would like to have the solutions, but more importantly, to understand the differences. For instance, I would say c) is a linear SVM, since the decision boundary is linear. But we can also homogenize the coordinates of a linear SVM decision boundary. d) is the kernelized SVM, since it is polynomial of order 2. f) is the rectified-linear neural network, due to the "rough" edges. Maybe a) is logistic regression: it is also a linear classifier, but based on probabilities.

Decision boundaries

Best Answer

Really like this question!

The first thing that comes to mind is the division between linear and non-linear classifiers. Three classifiers are linear (linear SVM, perceptron and logistic regression) and three plots show a linear decision boundary (A, B, C). So let's start with those.

Linear

The most salient linear plot is plot B because its line has a slope. This is odd for logistic regression and SVM because they can reduce their loss further by being a flat line (i.e. being further away from all the points). Thus, plot B is the perceptron. Since the perceptron output is either 0 or 1, all solutions that separate one class from the other are equally good. That is why it does not improve any further.
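The stopping behaviour described above can be sketched in a few lines. This is a minimal illustration, not the code behind the quiz plots: the toy points, the labels and the `train_perceptron` helper are all made up here. The classic perceptron rule only updates on misclassified points, so training halts at the first separating line it reaches, whatever its slope.

```python
def train_perceptron(points, labels, max_epochs=100):
    """Classic perceptron rule with labels in {+1, -1}.

    Returns the weights, the bias and the number of updates made.
    """
    w = [0.0, 0.0]
    b = 0.0
    updates = 0
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), y in zip(points, labels):
            # Update only when the point is on the wrong side (or on the line).
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:
                w[0] += y * x1
                w[1] += y * x2
                b += y
                mistakes += 1
        updates += mistakes
        if mistakes == 0:
            # Every point is classified correctly: nothing left to learn from.
            break
    return w, b, updates


# Hypothetical toy data: "crosses" (+1) on top, "dots" (-1) at the bottom.
points = [(1.0, 3.0), (2.0, 4.0), (3.0, 3.5), (1.0, 0.0), (2.0, -1.0), (3.0, 0.5)]
labels = [1, 1, 1, -1, -1, -1]

w, b, updates = train_perceptron(points, labels)
margins = [y * (w[0] * x1 + w[1] * x2 + b) for (x1, x2), y in zip(points, labels)]
print("weights:", w, "bias:", b, "updates:", updates)
print("all points separated:", all(m > 0 for m in margins))
```

On this toy set a horizontal line would separate the classes with more room to spare, but the perceptron has no reason to look for it once the mistakes stop.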

The difference between plots A and C is more subtle. The decision boundary is slightly lower in plot A. An SVM has a fixed number of support vectors, which alone determine its boundary, while the loss function of logistic regression is determined by all the points. Since there are more red crosses than blue dots, logistic regression avoids the red crosses more than the blue dots. The linear SVM just tries to be as far away from the red support vectors as from the blue support vectors. That's why plot A is the decision boundary of logistic regression and plot C is made using a linear SVM.
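One way to see why only some points matter to the SVM is to compare the per-point losses as a function of the signed margin y·f(x). The sketch below assumes the standard hinge loss for the SVM and the standard log loss for logistic regression; the function names are just illustrative.

```python
import math


def hinge_loss(margin):
    # SVM: exactly zero once a point is beyond the margin (margin >= 1).
    return max(0.0, 1.0 - margin)


def logistic_loss(margin):
    # Logistic regression: small but never zero, so every point keeps pulling.
    return math.log1p(math.exp(-margin))


for margin in [0.5, 1.0, 2.0, 5.0]:
    print(f"margin={margin:3.1f}  hinge={hinge_loss(margin):.4f}  "
          f"logistic={logistic_loss(margin):.4f}")
```

Points far from the boundary contribute nothing to the hinge loss, while they still contribute a little to the logistic loss, so a larger class can push the logistic boundary away from itself.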

Non-linear

Let's continue with the non-linear plots and classifiers. I agree with your observation that plot F is probably the ReLU NN since it has the sharpest boundaries. A ReLU unit becomes active at once when its input exceeds 0, and this causes the output unit to follow a different linear segment. If you look really closely you can spot about 8 changes of direction in the line, so probably 2 units have little impact on the final outcome. So plot F is the ReLU NN.
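The "one kink per active unit" idea can be checked on a one-dimensional slice. The sketch below uses made-up weights (not the network from the quiz) for a single hidden layer of 10 ReLU units and counts where the output changes slope. Units whose output weight is zero add no kink, which is one way 10 units can show only about 8 changes of direction.

```python
def relu(z):
    return max(0.0, z)


def network(x, units):
    """One hidden ReLU layer on a 1-D input; units are (w, b, v) triples."""
    return sum(v * relu(w * x + b) for w, b, v in units)


def count_kinks(units, lo=-2.0, hi=12.0, h=0.25):
    """Count grid points where the second difference is non-zero."""
    steps = int(round((hi - lo) / h))
    kinks = 0
    for i in range(1, steps):
        x = lo + i * h
        second_diff = network(x + h, units) - 2.0 * network(x, units) + network(x - h, units)
        if abs(second_diff) > 1e-9:
            kinks += 1
    return kinks


# Hypothetical weights: unit k switches on at x = k, output weights alternate in sign.
units = [(1.0, float(-k), 0.5 * (-1.0) ** k) for k in range(10)]
# Same hidden layer, but the last two units have no influence on the output.
weak = units[:8] + [(w, b, 0.0) for w, b, _ in units[8:]]

print("kinks with 10 influential units:", count_kinks(units))
print("kinks with 2 silent units:", count_kinks(weak))
```

In two dimensions the same thing happens along the decision boundary: it is straight until it crosses the line where some unit switches on or off, and bends there.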

About the last two I am not so sure. Both a tanh NN and the polynomial kernelized SVM can have multiple boundaries. Plot D is obviously classified worse. A tanh NN could improve on this situation by bending the curves differently and putting more blue or red points in the outer region. This plot is kind of strange, though. I guess the upper left part is classified as red and the lower right part as blue. But how is the middle part classified? It should be red or blue, but then one of the decision boundaries shouldn't have been drawn. The only possible option is thus that the outer parts are classified as one color and the inner part as the other color. That's strange and really bad. So I am not sure about this one.
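For what it's worth, a two-part boundary with one color in the middle is what a degree-2 polynomial kernel can produce. Assuming the usual kernel form (γ xᵢᵀx + r)² and the usual dual notation (αᵢ, yᵢ, b are the standard SVM symbols, not something given in the quiz), the decision function is a quadratic in x:

```latex
f(x) = \sum_i \alpha_i y_i \,(\gamma\, x_i^\top x + r)^2 + b
     = x^\top A\, x + v^\top x + c,
\qquad
A = \gamma^2 \sum_i \alpha_i y_i\, x_i x_i^\top,\quad
v = 2\gamma r \sum_i \alpha_i y_i\, x_i,\quad
c = r^2 \sum_i \alpha_i y_i + b .
```

So the boundary f(x) = 0 is a conic section. If it happens to be a hyperbola, it has two branches, with one class between them and the other class outside both.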

Let's look at plot E. It has both curved and straight lines. For a degree-2 kernelized SVM it is difficult (close to impossible) to have a straight-line decision boundary since the squared distance gradually favors one of the two classes. The tanh activation functions, however, can get saturated such that the hidden state is composed of -1's and +1's. If only one unit then changes its state to, say, 0.5, you can get a linear decision boundary. So I would say that plot E is a tanh NN and thus plot D is a kernelized SVM. Too bad for the poor old SVM though.
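The saturation argument can be made concrete with a small sketch. All weights below are invented for illustration. Nine of the ten tanh units have large pre-activations over the region of interest and sit at (almost exactly) +1, so the output there depends only on the one remaining unit, and the boundary in that region is the straight line x1 = x2.

```python
import math

# Hypothetical weights (w1, w2, bias) for 10 tanh units. The first nine have
# large pre-activations for x1 in [-1, 1], so they saturate at about +1.
units = [(20.0, 0.0, 40.0 + k) for k in range(9)] + [(1.0, -1.0, 0.0)]
out_weights = [1.0] * 10
out_bias = -9.0  # cancels the nine saturated units


def hidden(x1, x2):
    return [math.tanh(w1 * x1 + w2 * x2 + b) for w1, w2, b in units]


def output(x1, x2):
    return sum(v * h for v, h in zip(out_weights, hidden(x1, x2))) + out_bias


h = hidden(0.5, 0.2)
print("saturated units:", h[:9])
print("free unit:", h[9])
for point in [(0.5, 0.2), (0.3, 0.3), (0.2, 0.5)]:
    print(point, "->", output(*point))
```

The output changes sign exactly where the free unit's pre-activation x1 - x2 crosses zero. Where several units are unsaturated at once the boundary curves, which would fit a plot with both straight and curved parts.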

Conclusions

A - Logistic Regression
B - Perceptron
C - Linear SVM
D - Kernelized SVM (Polynomial kernel of order 2)
E - Neural Network (1 hidden layer with 10 tanh units)
F - Neural Network (1 hidden layer with 10 rectified linear units)