Solved – Neural Network Forward Propagation

machine learning, neural networks

I'm trying to solve this neural network problem found here:

http://i.stack.imgur.com/yRznl.png

How do I go about calculating the forward propagation in this example? I've seen examples of how to calculate the expected output, but that is already given here, and I'm not quite sure what I even need to do, or where to start, to calculate the forward propagation.

Best Answer

Forward propagation is simply multiplying the input by the weights and adding the bias before applying the activation function (the sigmoid here) at each node. There is no bias in this question.
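In symbols, each layer computes (a generic statement of the rule above; the bias $b^{(l)}$ is zero throughout this question):

$ z^{(l)} = W^{(l)}a^{(l)} + b^{(l)}, \qquad a^{(l+1)} = sigm(z^{(l)}), \qquad a^{(1)} = x $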

$ W^{(1)}x = z^{(1)} = \begin{bmatrix} W_{11}^{(1)} & W_{12}^{(1)} \\[0.3em] W_{21}^{(1)} & W_{22}^{(1)} \end{bmatrix} \begin{bmatrix} x_1 \\[0.3em] x_2 \end{bmatrix} = \begin{bmatrix} 0.5 & 0.1 \\[0.3em] 0.25 & 0.75 \end{bmatrix} \begin{bmatrix} 1 \\[0.3em] 0 \end{bmatrix} = \begin{bmatrix} 0.5 \\[0.3em] 0.25 \end{bmatrix}$

$ a^{(2)} = sigm(z^{(1)}) = sigm\left(\begin{bmatrix} 0.5 \\[0.3em] 0.25 \end{bmatrix}\right) = \begin{bmatrix} 0.6225 \\[0.3em] 0.5622 \end{bmatrix} $

$ W^{(2)}a^{(2)} = z^{(2)} = \begin{bmatrix} W_{11}^{(2)} & W_{12}^{(2)} \end{bmatrix} \begin{bmatrix} a^{(2)}_1 \\[0.3em] a^{(2)}_2 \end{bmatrix} = 0.95 \cdot 0.6225 + 1.0 \cdot 0.5622 = 1.1536 $

$ a^{(3)}= sigm(z^{(2)}) = sigm(1.1536) = 0.7602 $
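If it helps to check the arithmetic, here is a minimal NumPy sketch of this forward pass (the variable names are my own; the weight and input values are the ones from the question):

```python
import numpy as np

def sigm(z):
    # Sigmoid activation: 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

# Weights and input as given in the question.
W1 = np.array([[0.50, 0.10],
               [0.25, 0.75]])   # W^(1): input -> hidden
W2 = np.array([0.95, 1.00])     # W^(2): hidden -> output
x = np.array([1.0, 0.0])

z1 = W1 @ x      # [0.5, 0.25]
a2 = sigm(z1)    # [0.6225, 0.5622]
z2 = W2 @ a2     # 1.1536
a3 = sigm(z2)    # 0.7602, the network output
print(a3)
```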

This is your output. Now assume that your cost function is

$ C = \frac{1}{2}(a^{(3)} -y )^2$

where $y$ is the expected output, $y = 0.5$, and the output error term is derived as

$ \delta^{(3)} = \frac{dC}{dz^{(2)}} = (a^{(3)} - y).*a^{(3)}.*(1-a^{(3)}) = (0.7602 - 0.5) \cdot 0.7602 \cdot (1-0.7602) = 0.0474$

where '.*' denotes the element-wise product and $a^{(3)}.*(1-a^{(3)})$ comes from the derivative of the sigmoid. Note that I've assumed the error term is defined with respect to $z$, not $a$; if it is defined with respect to $a$ instead, the derivation changes slightly.
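As a quick numerical check, continuing the sketch with the rounded values from the forward pass:

```python
a3, y = 0.7602, 0.5                # network output and expected output
cost = 0.5 * (a3 - y) ** 2         # ≈ 0.0339 (only its derivative is used below)
delta3 = (a3 - y) * a3 * (1 - a3)  # dC/dz^(2), with the sigmoid derivative a(1-a)
print(cost, delta3)                # ≈ 0.0339, ≈ 0.0474
```

Now back-propagate $\delta^{(3)}$ to find $\delta_{2}^{(2)}$: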

$ \delta_{2}^{(2)} = \frac{dC}{dz^{(2)}} \cdot \frac{dz^{(2)}}{dz_{2}^{(1)}} = \delta^{(3)} \cdot \frac{dz^{(2)}}{dz_{2}^{(1)}} $

Let's derive the second term before we continue:

$ \frac{dz^{(2)}}{dz_{2}^{(1)}} = W_{12}^{(2)}.*a_{2}^{(2)}.*(1-a_{2}^{(2)})$, which follows from $ z^{(2)} = W^{(2)} sigm(z^{(1)}) $.

Now we can evaluate the previous equation: $ \delta_{2}^{(2)} = \frac{dC}{dz^{(2)}} \cdot \frac{dz^{(2)}}{dz_{2}^{(1)}} = \delta^{(3)} \cdot W_{12}^{(2)}.*a_{2}^{(2)}.*(1-a_{2}^{(2)}) = 0.0474 \cdot 1.0 \cdot 0.5622 \cdot (1-0.5622) = 0.0117 $
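In the same plain-number style (again using the rounded values from above):

```python
delta3 = 0.0474  # output error term from the previous step
w12_2 = 1.0      # W_12^(2): weight from hidden unit 2 to the output
a2_2 = 0.5622    # a_2^(2): activation of hidden unit 2

delta2_2 = delta3 * w12_2 * a2_2 * (1 - a2_2)
print(delta2_2)  # ≈ 0.0117
```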

You can see how quickly the error term diminishes during back-propagation when we use a sigmoid activation (or hyperbolic tangent).
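To see that effect numerically, here is a toy loop (purely illustrative values, not part of the question). The sigmoid's derivative $a(1-a)$ never exceeds $0.25$, so with weights around $1$ the error term shrinks by at least a factor of four per layer:

```python
# Toy illustration: an error term is multiplied by w * sigm'(z) at each layer.
# sigm'(z) = a(1-a) peaks at 0.25, used here as the best case.
delta, w, sprime_max = 1.0, 1.0, 0.25
for layer in range(10):
    delta *= w * sprime_max
print(delta)  # 0.25**10 ≈ 9.5e-07 after ten layers
```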