How are kernels applied to feature maps to produce other feature maps?

Tags: conv-neural-network, deep learning, machine learning, neural networks

I am trying to understand the convolution part of convolutional neural networks. Looking at the following figure:

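[Figure not reproduced: a CNN diagram in which the first convolution layer, C1, has 4 feature maps and the next convolution layer has 6.]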

I have no problems understanding the first convolution layer where we have 4 different kernels (of size $k \times k$), which we convolve with the input image to obtain 4 feature maps.

What I do not understand is the next convolution layer, where we go from 4 feature maps to 6 feature maps. I assume we have 6 kernels in this layer (consequently giving 6 output feature maps), but how do these kernels work on the 4 feature maps shown in C1? Are the kernels 3-dimensional, or are they 2-dimensional and replicated across the 4 input feature maps?

Best Answer

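The kernels are 3-dimensional: the width and height can be chosen freely, while the depth is, in general, equal to the number of feature maps in the input layer. Each kernel therefore sums over all input maps at once and produces a single output feature map, so 6 kernels give 6 output maps.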
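To make the shapes concrete, here is a minimal NumPy sketch of such a layer for the 4-to-6 case from the question. The 10x10 map size, the 3x3 kernel size, the random values, and the function name `conv_layer` are all assumptions for illustration; bias and the nonlinearity are left out, and, as in most deep learning libraries, the "convolution" is computed as a cross-correlation (no kernel flip).

```python
import numpy as np

def conv_layer(x, w):
    """Valid cross-correlation of x (C_in, H, W) with w (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = w.shape
    assert x.shape[0] == c_in, "kernel depth must match the number of input maps"
    h_out = x.shape[1] - k + 1
    w_out = x.shape[2] - k + 1
    y = np.zeros((c_out, h_out, w_out))
    for o in range(c_out):
        for i in range(h_out):
            for j in range(w_out):
                # One output value: sum over ALL input maps and the k x k window.
                y[o, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[o])
    return y

rng = np.random.default_rng(0)
c1 = rng.standard_normal((4, 10, 10))        # 4 feature maps (assumed 10x10)
kernels = rng.standard_normal((6, 4, 3, 3))  # 6 kernels, each 4 deep (assumed 3x3)
out = conv_layer(c1, kernels)
print(out.shape)  # (6, 8, 8)
```

Each of the 6 kernels holds 4 x 3 x 3 = 36 weights, one 3x3 slice per input map, and the slices are free to differ from one another.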

The kernels are certainly not 2-dimensional and replicated across the input feature maps at the same 2D location! That would mean a kernel couldn't distinguish between its input features at a given location, since it would apply one and the same weight to every input feature map, which amounts to summing the maps first and then filtering the sum.
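A small experiment can illustrate this point. The sketch below (sizes, random values, and the helper name `correlate_valid` are assumptions for illustration) shuffles the order of the input feature maps: a replicated 2D kernel gives exactly the same output as before, so it cannot tell the maps apart, whereas a kernel with separate weights per map does react to the shuffle.

```python
import numpy as np

def correlate_valid(x, w):
    """Valid cross-correlation of x (C, H, W) with a single kernel w (C, k, k)."""
    k = w.shape[-1]
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    y = np.zeros((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            y[i, j] = np.sum(x[:, i:i + k, j:j + k] * w)
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))              # 4 input feature maps
w2d = rng.standard_normal((3, 3))
w_replicated = np.broadcast_to(w2d, (4, 3, 3))  # same 2D weights on every map
w_full = rng.standard_normal((4, 3, 3))         # independent weights per map

x_shuffled = x[[2, 0, 3, 1]]                    # reorder the input maps

# Replicated kernel: output is unchanged by the shuffle.
same_replicated = np.allclose(correlate_valid(x, w_replicated),
                              correlate_valid(x_shuffled, w_replicated))
# Full 3D kernel: output changes, because each map has its own weights.
same_full = np.allclose(correlate_valid(x, w_full),
                        correlate_valid(x_shuffled, w_full))
# Replicated kernel is equivalent to filtering the sum of the maps.
collapsed = np.allclose(correlate_valid(x, w_replicated),
                        correlate_valid(x.sum(axis=0, keepdims=True), w2d[None]))
print(same_replicated, same_full, collapsed)
```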