How Are Kernel Weights Trained in 1-D CNNs with Multi-dimensional Input

conv-neural-network, machine-learning, neural-networks

I have far from a perfect understanding of how 1-D convolutional neural networks learn, but I think I understand how the kernel operates on 1-D input data. How does 1-D convolution work with multi-dimensional input data? An image from this article:

[image: diagram of a 1-D convolution kernel sliding along multi-channel input]

has led me to think that the "kernel_size" argument in tensorflow.keras Conv1D layers controls the width of the kernel (which seems obvious with 1-D input data), but that the kernels themselves are as "deep" as the number of rows in the input data. Is this understanding correct? If so, what is the relationship between the weights in the "rows" of the kernel? Are they all the same, different but related in some way, or entirely different? Please let me know if my understanding seems off-base, and thanks in advance for your help!

Best Answer

You're correct: kernel_size is the "width" in this diagram, and the kernel is as "deep" as the input has channels (3 here). There is no relationship between the weights in the different "rows": each channel gets its own independently learned set of kernel_size weights, and at every position the per-channel products are simply summed (plus a bias) to give one output value. The only thing the rows share is their width.
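You can see this by inspecting the layer's weights directly. A minimal sketch in tensorflow.keras (the shapes are the point; the input sizes and variable names are just placeholders):

```python
import numpy as np
import tensorflow as tf

# Toy input: batch of 1, sequence length 10, 3 channels (the "rows" in the diagram).
x = np.random.rand(1, 10, 3).astype("float32")

# A single filter of width 4.
layer = tf.keras.layers.Conv1D(filters=1, kernel_size=4)
_ = layer(x)  # calling the layer once builds its weights

kernel, bias = layer.get_weights()
print(kernel.shape)  # (4, 3, 1): width 4, depth 3 -- one independent row of 4 weights per channel
print(bias.shape)    # (1,): one bias per filter
```

Each of the 3 rows of 4 weights is trained freely by backpropagation; nothing ties them together.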

There is a related technique called "depthwise separable convolution" (used in architectures such as MobileNet and Xception), where each channel is first convolved with its own filter and the per-channel outputs are then mixed by a pointwise (width-1) convolution. That factorization constrains how the rows of the effective kernel relate to each other, but a plain Conv1D layer imposes no such constraint.
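For comparison, Keras exposes this variant as SeparableConv1D; inspecting its weights shows the factorized structure (again a sketch with placeholder sizes):

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(1, 10, 3).astype("float32")  # batch 1, length 10, 3 channels

sep = tf.keras.layers.SeparableConv1D(filters=1, kernel_size=4)
_ = sep(x)  # build the weights

depthwise, pointwise, bias = sep.get_weights()
print(depthwise.shape)  # (4, 3, 1): one width-4 filter per input channel
print(pointwise.shape)  # (1, 3, 1): width-1 convolution that mixes the 3 channel outputs
```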
