Solved – Decision boundary plot for a perceptron

decision-theory, machine-learning, neural-networks, perceptron, python

I am trying to plot the decision boundary of a perceptron algorithm and I am really confused about a few things. My input instances are of the form $[(x_{1},x_{2}), y]$, basically a 2D input instance ($x_{1}$ and $x_{2}$) and a binary class target value ($y$, either 1 or 0).

My weight vector hence is in the form: $[w_{1}, w_{2}]$.

Now I have to incorporate an additional bias parameter $w_{0}$, so does my weight vector become a $3 \times 1$ vector? Or is it a $1 \times 3$ vector? I think it should be $1 \times 3$, since a vector has only 1 row and $n$ columns.

Now let's say I instantiate $[w_{0}, w_{1}, w_{2}]$ to random values. How would I plot the decision boundary for this? What does $w_{0}$ signify here? Is $w_{0}/\|\mathbf{w}\|$ the distance of the decision boundary from the origin? If so, how do I capture this and plot it in Python using matplotlib.pyplot, or in its MATLAB equivalent?

I would really appreciate even a little help regarding this matter.

Best Answer

In each iteration, the perceptron predicts the output according to the equation:

$$y_{j} = f[{\bf{w}}^{T} {\bf{x}}] = f[\vec{w}\cdot \vec{x}] = f[w_{0} + w_{1}x_{1} + w_{2}x_{2} + ... + w_{n}x_{n}]$$

As you said, your weight vector $\vec{w}$ contains a bias term $w_{0}$. Therefore, you need to prepend a $1$ to each input to preserve the dimensions in the dot product.

You usually start with a column vector for the weights, that is, an $n \times 1$ vector. By definition, the dot product requires you to transpose this vector into a $1 \times n$ row vector, which is then multiplied by the $n \times 1$ input vector. That's why I emphasized the change between matrix notation and vector notation in the equation above: the notation itself suggests the right dimensions.
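For concreteness, here is a minimal NumPy sketch of that prediction step; the weight values and the step activation $f$ are illustrative assumptions, not something fixed by the question:

import numpy as np

# Hypothetical weights [w0, w1, w2], bias first, as in the equation above
w = np.array([0.5, -1.0, 2.0])

def predict(x):
    # Prepend a constant 1 so the bias w0 takes part in the dot product,
    # then apply a step activation as f
    x_aug = np.r_[1.0, np.asarray(x, dtype=float)]
    return 1 if np.dot(w, x_aug) >= 0 else 0

print(predict([2, 1]))  # -> 1 for these illustrative weights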

Remember, this is done for each input in the training set. After each prediction, you update the weight vector to correct the error between the predicted output and the true output.
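As a sketch of that update, this is the classic perceptron learning rule in NumPy; the learning rate, the epoch count, and the toy data (the same points as in the scikit-learn example below) are assumptions for illustration:

import numpy as np

# Toy training set, targets in {0, 1}
X = np.array([[2, 1], [3, 4], [4, 2], [3, 1]])
Y = np.array([0, 0, 1, 1])

w = np.zeros(3)   # [w0, w1, w2], bias first
eta = 0.1         # assumed learning rate

for epoch in range(100):
    for x, y in zip(X, Y):
        x_aug = np.r_[1.0, x]                      # prepend 1 for the bias
        y_hat = 1 if np.dot(w, x_aug) >= 0 else 0  # current prediction
        w += eta * (y - y_hat) * x_aug             # correct toward the target

print(w)  # learned [w0, w1, w2]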

As for the decision boundary, here is a modification of the scikit-learn code I found here:

import numpy as np
from sklearn.linear_model import Perceptron
import matplotlib.pyplot as plt

X = np.array([[2,1],[3,4],[4,2],[3,1]])
Y = np.array([0,0,1,1])
h = .02  # step size in the mesh


# create an instance of Perceptron and fit our data
# (recent scikit-learn versions use max_iter instead of the old n_iter)
clf = Perceptron(max_iter=100).fit(X, Y)

# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max] x [y_min, y_max].
fig, ax = plt.subplots()
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot
Z = Z.reshape(xx.shape)
ax.contourf(xx, yy, Z, cmap=plt.cm.Paired)
ax.axis('off')

# Plot also the training points
ax.scatter(X[:, 0], X[:, 1], c=Y, cmap=plt.cm.Paired)

ax.set_title('Perceptron')
plt.show()

which produces the following plot:

[Figure: filled contour plot of the two predicted class regions, with the four training points overlaid]

Basically, the idea is to predict a class for each point in a mesh that covers the input space, and to plot each prediction with an appropriate color using contourf.
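To answer the $w_{0}$ question directly: the boundary is the line $w_{0} + w_{1}x_{1} + w_{2}x_{2} = 0$, and $|w_{0}|/\|\mathbf{w}\|$ is indeed its distance from the origin. If you want the boundary drawn as an explicit line rather than as filled regions, a sketch like the following (assuming the clf, x_min, x_max, and ax from the snippet above, and $w_{2} \neq 0$) can be inserted just before plt.show():

# clf.coef_[0] holds [w1, w2] and clf.intercept_[0] holds w0, so solving
# w0 + w1*x1 + w2*x2 = 0 for x2 gives x2 = -(w0 + w1*x1) / w2
w1, w2 = clf.coef_[0]
w0 = clf.intercept_[0]

xs = np.linspace(x_min, x_max, 100)
ax.plot(xs, -(w0 + w1 * xs) / w2, 'k--')  # dashed boundary line over contourf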