I am trying to implement back-propagation for a simple convolutional network. Specifically, I understand that one of the steps is to convolve the gradients coming from the next layer with the rotated kernels. However, I cannot work out the parameters of this convolution. Let's look at an example:
Imagine that the layer's input is an image of size (height, width) == (4, 4) with only 1 color channel. We also use zero-padding == 2 and stride == 1. Our kernel (let's say we only have 1 kernel) is of size (filterHeight, filterWidth) == (2, 2). We can now use the following equation to compute each dimension of the output:
out = (in - filter + 2*padding) / stride + 1    (1)
So the output volume has (height, width) == (7, 7) and everything is fine.
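As a quick check of the arithmetic, equation (1) can be written as a small function. This is only a sketch: the name conv_output_size and the error raised on a non-integer result are my own choices, not part of any library.

```python
def conv_output_size(in_size, filter_size, padding, stride):
    """Output length along one dimension, following equation (1)."""
    span = in_size - filter_size + 2 * padding
    # Assumption: reject configurations where the division is not exact.
    if span % stride != 0:
        raise ValueError("stride does not divide (in - filter + 2*padding)")
    return span // stride + 1


# The example from the question: 4x4 input, 2x2 kernel, padding 2, stride 1.
print(conv_output_size(4, 2, padding=2, stride=1))  # 7
```

The same function applies to height and width separately, so a (4, 4) input gives a (7, 7) output.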
Now comes the backward pass. I need to convolve this (7, 7) tensor with the rotated (2, 2) weights and get a tensor of size (4, 4). How can I compute the stride and padding? I only have one equation, (1), but two unknowns! Therefore there isn't a unique solution for stride and padding!
I can find heuristically that the combination of padding == 2 and stride == 3 works, but why choose this over the potentially infinite number of valid combinations?
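To make the ambiguity concrete, here is a short sketch that enumerates the (padding, stride) pairs satisfying equation (1) for in == 7, filter == 2, out == 4. The function name matching_combinations and the max_padding cut-off are made up for illustration.

```python
def matching_combinations(in_size, filter_size, out_size, max_padding):
    """List (padding, stride) pairs that satisfy equation (1) exactly."""
    steps = out_size - 1  # equation (1) rearranged: stride = span / (out - 1)
    combos = []
    for padding in range(max_padding + 1):
        span = in_size - filter_size + 2 * padding
        if steps > 0 and span > 0 and span % steps == 0:
            combos.append((padding, span // steps))
    return combos


print(matching_combinations(7, 2, 4, max_padding=12))
# [(2, 3), (5, 5), (8, 7), (11, 9)]
```

Every pair in this list produces the right output shape, so the shape equation alone cannot single out one of them.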
Best Answer
As far as I know, you need to perform a "full" convolution during the backpropagation step. The gradients from layer l+1 form a (7, 7) tensor. The "full" convolution with the rotated (2, 2) filter, which pads the gradient with filter - 1 = 1 zero on each side and uses stride 1, results in an (8, 8) tensor, the same size as the zero-padded input. Removing the padding of 2 that was originally added to your (4, 4) input again gives a (4, 4) tensor. There is no need to calculate or guess any further padding/stride for the backprop convolution.
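The steps above can be sketched in plain Python (nested lists, no libraries). It rests on a few assumptions: the forward pass is a stride-1 cross-correlation, as is common in CNN code, the kernel is square, and all helper names (pad2d, correlate_valid, rot180, forward, backward_input, loss) are made up for this illustration. Because the forward pass is linear in the input, bumping one input pixel by 1 changes the loss by exactly the corresponding gradient entry, which gives an easy check.

```python
def pad2d(x, p):
    """Surround a 2-D list with p rows and columns of zeros."""
    width = len(x[0]) + 2 * p
    blank = [[0] * width for _ in range(2 * p)]
    body = [[0] * p + list(row) + [0] * p for row in x]
    return blank[:p] + body + blank[p:]


def correlate_valid(x, k):
    """Stride-1 'valid' cross-correlation of x with kernel k."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def rot180(k):
    """Rotate a kernel by 180 degrees."""
    return [row[::-1] for row in k[::-1]]


def forward(x, k, padding):
    """Forward pass: zero-pad the input, then slide the kernel."""
    return correlate_valid(pad2d(x, padding), k)


def backward_input(dy, k, padding):
    """Gradient w.r.t. the unpadded input (stride 1, square kernel assumed)."""
    n = len(k)
    # "Full" convolution: pad dy by n - 1, slide the rotated kernel.
    full = correlate_valid(pad2d(dy, n - 1), rot180(k))
    # Drop the rows and columns that correspond to the forward zero-padding.
    end_r, end_c = len(full) - padding, len(full[0]) - padding
    return [row[padding:end_c] for row in full[padding:end_r]]


def loss(x, k, dy, padding):
    """Scalar whose gradient w.r.t. the forward output is exactly dy."""
    y = forward(x, k, padding)
    return sum(dy[i][j] * y[i][j]
               for i in range(len(y)) for j in range(len(y[0])))


x = [[4 * i + j for j in range(4)] for i in range(4)]    # (4, 4) input
k = [[1, 2], [3, 4]]                                     # (2, 2) kernel
dy = [[7 * i + j for j in range(7)] for i in range(7)]   # (7, 7) upstream gradient

dx = backward_input(dy, k, padding=2)
print(len(forward(x, k, 2)), len(dx))  # 7 4

# Check every entry of dx against a unit bump of the matching input pixel.
for i in range(4):
    for j in range(4):
        bumped = [row[:] for row in x]
        bumped[i][j] += 1
        assert loss(bumped, k, dy, 2) - loss(x, k, dy, 2) == dx[i][j]
```

Integer values are used so that the comparison is exact. With a forward stride other than 1 the backward pass needs additional handling that this sketch does not cover.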