Solved – Are pooling layers added before or after dropout layers

conv-neural-networkdeep learningdropout

I'm creating a convolutional neural network (CNN), where I have a convolutional layer followed by a pooling layer and I want to apply dropout to reduce overfitting. I have this feeling that the dropout layer should be applied after the pooling layer, but I don't really have anything to back that up. Where is the right place to add the dropout layer? Before or after the pooling layer?

Best Answer

Edit: As @Toke Faurby correctly pointed out, the default implementation in tensorflow actually uses an element-wise dropout. What I described earlier applies to a specific variant of dropout in CNNs, called spatial dropout:

In a CNN, each neuron produces one feature map. Since ~~dropout~~ spatial dropout works per-neuron, dropping a neuron means that the corresponding feature map is dropped - e.g. each position has the same value (usually 0). So each feature map is either fully dropped or not dropped at all.

Pooling usually operates separately on each feature map, so it should not make any difference if you apply dropout before or after pooling. At least this is the case for pooling operations like maxpooling or averaging.

Edit: However, if you actually use element-wise dropout (which seems to be set as default for tensorflow), it actually makes a difference if you apply dropout before or after pooling. However, there is not necessarily a wrong way of doing it. Consider the average pooling operation: if you apply dropout before pooling, you effectively scale the resulting neuron activations by 1.0 - dropout_probability, but most neurons will be non-zero (in general). If you apply dropout after average pooling, you generally end up with a fraction of (1.0 - dropout_probability) non-zero "unscaled" neuron activations and a fraction of dropout_probability zero neurons. Both seems viable to me, neither is outright wrong.

Best Answer

Related Solutions

Solved – Backpropagation between pooling and convolutional layers

Solved – How todentify Overfitting in Convolutional Neural network

Related Question