Solved – number of feature maps in convolutional neural networks

Tags: conv-neural-network, deep learning, machine learning, neural networks, pattern recognition

While learning about convolutional neural networks, I have some questions about the following figure.

1) C1 in layer 1 has 6 feature maps. Does that mean there are six convolutional kernels, each of which is used to generate one feature map from the input?

2) S1 in layer 2 has 6 feature maps and C2 has 16 feature maps. What does the process look like to get these 16 feature maps from the 6 feature maps in S1?

[Figure: the network diagram referred to in the question; image not preserved]

Best Answer

1) C1 in layer 1 has 6 feature maps. Does that mean there are six convolutional kernels, each of which is used to generate one feature map from the input?

Yes. There are 6 convolutional kernels, and each one generates one feature map from the input. Another way to say this is that there are 6 filters, or 3D sets of weights, which I will just call weights. What the image doesn't show, and probably should, is that images typically have 3 channels, say red, green, and blue. So the weights that map the input to C1 have shape 3x5x5, not just 5x5. (For a single-channel grayscale input they would be 1x5x5.) The same 3-dimensional kernel is applied across the entire 3x32x32 image to generate one 2-dimensional feature map in C1. There are 6 kernels (each 3x5x5) in this example, so there are 6 feature maps, each the result of sliding one 3x5x5 kernel across the input. Each map is 28x28 because the stride is 1 and there is no padding: 32 - 5 + 1 = 28.
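To make the shapes concrete, here is a minimal NumPy sketch of the C1 step. It assumes a 3-channel 32x32 input, as in the explanation above, and uses random values for the image and weights. The function name `conv2d_valid` is made up for this illustration, and biases and the nonlinearity are left out.

```python
import numpy as np

def conv2d_valid(image, kernels):
    """Slide each kernel over the image with stride 1 and no padding.

    image:   array of shape (channels, height, width)
    kernels: array of shape (num_kernels, channels, kh, kw)
    returns: array of shape (num_kernels, height - kh + 1, width - kw + 1)
    """
    num_kernels, channels, kh, kw = kernels.shape
    assert image.shape[0] == channels, "kernel depth must match input channels"
    out_h = image.shape[1] - kh + 1
    out_w = image.shape[2] - kw + 1
    out = np.zeros((num_kernels, out_h, out_w))
    for k in range(num_kernels):
        for i in range(out_h):
            for j in range(out_w):
                # One output value: a 3D patch times a 3D kernel, summed to a scalar.
                patch = image[:, i:i + kh, j:j + kw]
                out[k, i, j] = np.sum(patch * kernels[k])
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((3, 32, 32))        # assumed RGB input
c1_kernels = rng.standard_normal((6, 3, 5, 5))  # 6 kernels, each 3x5x5
c1 = conv2d_valid(image, c1_kernels)
print(c1.shape)  # (6, 28, 28)
```

Each of the 6 kernels collapses the channel dimension, which is why a 3D kernel still produces a 2D feature map.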

2) S1 in layer 1 has 6 feature maps, C2 in layer 2 has 16 feature maps. What does the process look like to get these 16 feature maps from the 6 feature maps in S1?

Do the same thing as in layer 1, except this time the number of input channels is not 3 (RGB) but 6, one for each feature map in S1. There are now 16 distinct kernels, each of shape 6x5x5. Each layer 2 kernel is applied across all 6 feature maps of S1 to generate one 2D feature map in C2. Repeating this for each of the 16 kernels gives the 16 feature maps in C2. Each is 10x10 because the stride is 1 and there is no padding, which implies that the S1 maps are 14x14 (14 - 5 + 1 = 10).
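The S1 to C2 step can be sketched the same way. This is only an illustration under stated assumptions: S1 is taken to be 6 maps of 14x14 (the size implied by the 10x10 output), the values are random, and biases and the nonlinearity are omitted. It also checks that one C2 feature map is the sum of six per-channel 2D results, one for each S1 map.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
s1 = rng.standard_normal((6, 14, 14))            # assumed S1 output: 6 maps of 14x14
c2_kernels = rng.standard_normal((16, 6, 5, 5))  # 16 kernels, each 6x5x5

# All 5x5 windows of every S1 map: shape (6, 10, 10, 5, 5).
windows = sliding_window_view(s1, (5, 5), axis=(1, 2))

# For each kernel k and position (h, w), sum over channel c and window offsets (i, j).
c2 = np.einsum("chwij,kcij->khw", windows, c2_kernels)
print(c2.shape)  # (16, 10, 10)

# The first C2 map, rebuilt as a sum of 6 single-channel 2D results.
per_channel = [
    np.einsum("hwij,ij->hw", windows[c], c2_kernels[0, c]) for c in range(6)
]
first_map = np.sum(per_channel, axis=0)
print(np.allclose(first_map, c2[0]))  # True

# Number of weights in this layer (biases not counted): 16 * 6 * 5 * 5.
print(c2_kernels.size)  # 2400
```

So every C2 map draws on all 6 S1 maps, and the 16 maps differ only in which 6x5x5 set of weights is used.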

source: http://cs231n.github.io/convolutional-networks/