Solved – Back-propagation in Convolution layer

backpropagation, conv-neural-network

Most examples I found on the internet explain back-propagation in a convolution layer well, but only for a single kernel and a single input channel.

I do not understand how to do back-propagation for more than one kernel and more than one input channel.

Let's say I have a convolution layer that accepts an input $X$ of size 3x20x20, applies five 3x3x3 kernel filters $K$, and produces an output $O$ of size 5x18x18.

On a diagram it looks like this (I apologize for my horrible hand-drawing):
Forward Pass
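
To pin down the shapes, here is a minimal NumPy sketch of this forward pass, assuming a 'valid' convolution with stride 1 and the kernels stored as an array of shape 5x3x3x3 (filters x input channels x height x width); the names and the loop-based implementation are purely illustrative:

```python
# Illustrative forward pass: X is 3x20x20, K holds 5 kernels of size 3x3x3,
# O comes out as 5x18x18 (assumed 'valid' convolution, stride 1, no padding).
import numpy as np

def conv_forward(X, K):
    C_in, H, W = X.shape            # 3, 20, 20
    F, _, kH, kW = K.shape          # 5, 3, 3, 3
    out_h, out_w = H - kH + 1, W - kW + 1
    O = np.zeros((F, out_h, out_w))
    for f in range(F):              # one output channel per kernel
        for i in range(out_h):
            for j in range(out_w):
                # each kernel spans ALL input channels; their products are summed
                O[f, i, j] = np.sum(X[:, i:i+kH, j:j+kW] * K[f])
    return O

X = np.random.randn(3, 20, 20)
K = np.random.randn(5, 3, 3, 3)
print(conv_forward(X, K).shape)     # (5, 18, 18)
```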

During the backward pass the layer receives an error $\frac{\partial E}{\partial O}$ and propagates it back to the previous layer.

As I understand it, in order to compute $\frac{\partial E}{\partial X}$ I need to apply a 'full' convolution to $\frac{\partial E}{\partial O}$ with the kernels rotated 180°. So it looks like this:

Backward Pass
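
For the single-kernel, single-channel case mentioned at the start, this 'full' convolution can be written out directly. The sketch below is only illustrative (placeholder names, assumed shapes): it zero-pads $\frac{\partial E}{\partial O}$ by (kernel size − 1) on each side and then does a 'valid' convolution with the 180°-rotated kernel:

```python
# Single-kernel, single-channel 'full' convolution with a 180-degree-rotated
# kernel, implemented by zero-padding followed by a 'valid' convolution.
import numpy as np

def full_conv_single(dE_dO, k):
    kH, kW = k.shape
    k_rot = np.rot90(k, 2)                          # rotate kernel 180 degrees
    padded = np.pad(dE_dO, ((kH - 1, kH - 1), (kW - 1, kW - 1)))
    out_h = padded.shape[0] - kH + 1                # = height of dE/dO + kH - 1
    out_w = padded.shape[1] - kW + 1
    dE_dX = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            dE_dX[i, j] = np.sum(padded[i:i+kH, j:j+kW] * k_rot)
    return dE_dX

dE_dO = np.random.randn(18, 18)
k = np.random.randn(3, 3)
print(full_conv_single(dE_dO, k).shape)             # (20, 20)
```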

The dimensions of $\frac{\partial E}{\partial X}$ should match the dimensions of $X$ (3x20x20), but a convolution operation produces an output whose depth equals the number of kernels (in my case 5).

My question is: how can a 'full' convolution of $\frac{\partial E}{\partial O}$ (5x18x18) with the 5 rotated 3x3x3 filters produce an output $\frac{\partial E}{\partial X}$ of dimension 3x20x20? Isn't the depth of a convolution's output equal to the number of filters?

Best Answer

Let me explain the dimensions obtained (5x18x18 -> 3x20x20):

  • 5 -> 3: the flipped convolutions are repeated 3 times (once per input channel), but the contributions of each of the 5 filters are summed up, exactly as in the forward phase; see the sketch after this list. In any case, a convolutional layer can accept any input depth and produce any number of filters in output.

  • 18 -> 20: this comes from the full convolution, in which padding is applied to the input map, producing a larger output as a result.
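
Putting the two points together, here is a rough NumPy sketch of the input-gradient computation for the shapes in the question (illustrative names and plain loops, not an optimized implementation): for each of the 3 input channels, the full convolutions against all 5 filters are summed, which is what collapses the depth from 5 back to 3, while the padding of the full convolution grows 18x18 to 20x20:

```python
# Sketch of dE/dX for the multi-kernel, multi-channel case:
# dE/dO is 5x18x18, K is 5x3x3x3 (filters x in-channels x h x w),
# and the result dE/dX is 3x20x20.
import numpy as np

def conv_backward_input(dE_dO, K):
    F, out_h, out_w = dE_dO.shape            # 5, 18, 18
    _, C_in, kH, kW = K.shape                # 5, 3, 3, 3
    H, W = out_h + kH - 1, out_w + kW - 1    # 20, 20 (full convolution grows the map)
    padded = np.pad(dE_dO, ((0, 0), (kH - 1, kH - 1), (kW - 1, kW - 1)))
    dE_dX = np.zeros((C_in, H, W))
    for c in range(C_in):                    # one pass per input channel (5 -> 3)
        for f in range(F):                   # contributions of all 5 filters are summed
            k_rot = np.rot90(K[f, c], 2)     # 180-degree-rotated kernel slice
            for i in range(H):
                for j in range(W):
                    dE_dX[c, i, j] += np.sum(padded[f, i:i+kH, j:j+kW] * k_rot)
    return dE_dX

dE_dO = np.random.randn(5, 18, 18)
K = np.random.randn(5, 3, 3, 3)
print(conv_backward_input(dE_dO, K).shape)   # (3, 20, 20)
```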

Anyway, backpropagation in convolution layers is explained very well here.