Perceptron Mistakes

computer science, discrete mathematics, machine learning

In this problem, we will investigate the perceptron algorithm with different iteration orderings.

Consider applying the perceptron algorithm through the origin on a small training set containing three points:

$x^{(1)} =[-1,-1], y^{(1)}=1$
$x^{(2)} =[1,0], y^{(2)}=-1$
$x^{(3)} =[-1, 1.5], y^{(3)}=1$
Given that the algorithm starts with $\theta^{(0)} = 0$, the first point that the algorithm sees is always considered a mistake. The algorithm starts with some data point and then cycles through the data (in order) until it makes no further mistakes.
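For reference, whenever a point is misclassified, i.e. when $y^{(i)}(\theta \cdot x^{(i)}) \le 0$, the perceptron updates

$$\theta \leftarrow \theta + y^{(i)} x^{(i)},$$

so the very first update, on $x^{(1)}$, gives $\theta^{(1)} = \theta^{(0)} + y^{(1)} x^{(1)} = [-1, -1]$.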

Now assume that $x^{(3)} = [-1, 10]$. How many mistakes does the algorithm make until convergence if cycling starts with data point $x^{(1)}$?

Also provide the progression of the separating plane as the algorithm cycles, in the following list format: $[[\theta^{(1)}_1, \theta^{(1)}_2], \ldots, [\theta^{(N)}_1, \theta^{(N)}_2]]$, where the superscript denotes the successive values of $\theta$ as the separating plane progresses. For example, if $\theta$ progresses from $[0,0]$ (initialization) to $[1,2]$ to $[3,-2]$, you should enter $[[1,2],[3,-2]]$.

Number of mistakes of the perceptron algorithm if the algorithm starts with $x^{(1)}$ = $2$

Progression of the separating hyperplane of the perceptron algorithm if the algorithm starts with $x^{(1)}$ = $[[-1,-1], [-2, 9]]$

However, my answer seems to be wrong, and the input is rejected as a matrix of incorrect shape. Does anyone have an idea what I am doing wrong? Did I make a mistake while calculating the perceptron mistakes?

Best Answer

Your answer for the progression of the separating hyperplane starting from $x^{(1)}$, namely $[[-1,-1],[-2,9]]$, is incomplete because your algorithm loops over the $n = 3$ points only once.

Now, since different training examples might update the parameters in different directions, it is possible that later updates undo some of the earlier ones, so that some of the earlier examples are no longer correctly classified. Here, for instance, after the update on $x^{(3)}$ gives $\theta = [-2, 9]$, the first point is misclassified again: $y^{(1)}(\theta \cdot x^{(1)}) = 2 - 9 = -7 \le 0$.

So we have to go through the training set multiple times, say $T$ times. $T$ should be large enough that a full pass over the data produces no further updates; beyond that point the number of updates no longer changes.

With a sufficiently large $T$, if there exists a linear classifier through the origin that correctly classifies the training samples, this algorithm will find one:

Perceptron Through Origin (pseudocode turned into runnable Python):

import numpy as np

def perceptron_through_origin(X, y, T):
    theta = np.zeros(X.shape[1])                 # initialize theta = 0
    for t in range(T):                           # T full passes over the data
        for i in range(len(y)):                  # cycle through the points in order
            if y[i] * np.dot(X[i], theta) <= 0:  # mistake: wrong side of (or on) the boundary
                theta = theta + y[i] * X[i]      # perceptron update
    return theta

With your problem, run it with $T = 9$ or even $100$ (any sufficiently large value gives the same result) and you will get the following:
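A minimal driver sketch that reproduces this trace; the inline printing and the mistake counter are additions for illustration, not part of the original pseudocode:

import numpy as np

# training set from the problem, with x^(3) = [-1, 10]
X = np.array([[-1, -1], [1, 0], [-1, 10]])
y = np.array([1, -1, 1])

theta = np.zeros(2, dtype=int)   # integer dtype keeps the printed trace tidy
mistakes = 0
for t in range(100):             # T = 100; any T >= 6 already converges here
    for i in range(len(y)):
        if y[i] * np.dot(X[i], theta) <= 0:   # mistake: update and record it
            theta = theta + y[i] * X[i]
            mistakes += 1
            print("theta update:", theta)
print("mistakes:", mistakes)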

theta update: [-1 -1]
theta update: [-2 9]
theta update: [-3 8]
theta update: [-4 7]
theta update: [-5 6]
theta update: [-6 5]

mistakes: 6
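As a quick check, the final $\theta = [-6, 5]$ classifies all three points correctly: $y^{(1)}(\theta \cdot x^{(1)}) = 1$, $y^{(2)}(\theta \cdot x^{(2)}) = 6$, and $y^{(3)}(\theta \cdot x^{(3)}) = 56$ are all positive, so one more full pass makes no updates and the algorithm has converged.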
