Solved – CNN filters with different size using Keras

conv-neural-network, keras, machine learning, natural language, neural networks

A CNN can apply multiple filters to the raw input data. Normally I specify the number of filters as 'filters=250' and the filter size as 'kernel_size=3'. (This means I create 250 filters, each with a window width of 3, as this is for text.) I have also learned that, in theory, these filters can all have different sizes. So my questions are:

  1. Is it conventional, or at least the case most of the time, that people use a consistent filter size in a CNN?

  2. Is there an option in Keras for having filters of different sizes in the same CNN? E.g. the 1st filter has width 3, the 2nd filter has width 5, the 3rd filter has width 7, etc.

Best Answer

Regarding different sizes:

The rule of thumb I have observed is that shallow networks (https://www.aclweb.org/anthology/D14-1181.pdf) use multiple kernel widths, whereas deep networks (https://arxiv.org/pdf/1705.03122.pdf, https://arxiv.org/pdf/1711.04352.pdf) typically use a kernel width of 3.

I guess the reason is that in a deeper network you get a broader receptive field by stacking small filters. With kernel size 3, the receptive field is 3 tokens at the first layer, but already 5 tokens at the second layer, and so on: each additional stride-1 layer with kernel size 3 widens it by 2 tokens.
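To make the arithmetic concrete, here is a minimal sketch of how the receptive field grows with depth. It assumes stride 1 and no dilation in every layer; the helper name `receptive_field` is made up for illustration and is not part of Keras.

```python
def receptive_field(kernel_sizes):
    """Receptive field, in tokens, of stacked 1D convolutions.

    Assumes stride 1 and no dilation in every layer, so each layer
    with kernel size k widens the field by k - 1 tokens.
    """
    field = 1  # a single input token sees only itself
    for k in kernel_sizes:
        field += k - 1
    return field


# Kernel size 3 stacked 1, 2 and 3 times.
for depth in (1, 2, 3):
    print(depth, receptive_field([3] * depth))
# prints:
# 1 3
# 2 5
# 3 7
```

So three stacked width-3 layers already cover as many tokens as a single width-7 filter, which fits the observation that deep networks tend to stick with width 3.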

Regarding Keras:

I would create separate Conv1D layers, one per kernel width (be careful to use padding so that all outputs have the same length), and then use a Concatenate layer to merge them together.
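A minimal sketch of that idea with the Keras functional API might look like the following. The sequence length, vocabulary size, embedding dimension, number of filters per branch and the final classification head are all hypothetical values chosen for illustration; only the pattern of parallel Conv1D branches followed by Concatenate comes from the answer above.

```python
from tensorflow.keras import layers, Model

# Hypothetical sizes, chosen only for illustration.
SEQ_LEN, VOCAB_SIZE, EMBED_DIM = 100, 10000, 50

tokens = layers.Input(shape=(SEQ_LEN,), dtype="int32")
embedded = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(tokens)

# One Conv1D branch per kernel width. padding="same" keeps every branch
# at SEQ_LEN time steps, so the outputs can be joined on the channel axis.
branches = [
    layers.Conv1D(filters=100, kernel_size=k, padding="same",
                  activation="relu")(embedded)
    for k in (3, 5, 7)
]
merged = layers.Concatenate(axis=-1)(branches)  # (batch, SEQ_LEN, 300)

# Assumed head: max-pool over time, then a single sigmoid unit.
pooled = layers.GlobalMaxPooling1D()(merged)
output = layers.Dense(1, activation="sigmoid")(pooled)

model = Model(inputs=tokens, outputs=output)
```

If each branch is pooled over time before concatenation instead, the branch outputs no longer depend on the sequence length and the padding is not strictly needed.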
