I'm learning about recurrent neural networks right now, and am in chapter 6 of Deep Learning with Python by Francois Chollet.
The chapter discusses using dropout in recurrent layers. I understand the logic behind having the inputs randomized the same way at each time step, since RNNs are used to learn sequence data, but I'm having a difficult time parsing some of the finer details between the dropout and recurrent_dropout arguments you can pass in.
Take this simple example:
keras.layers.GRU(32, dropout=0.2, recurrent_dropout=0.2)
Whenever I see snippets like this on the internet, both dropout arguments are usually set to the same value. Is this a best practice or just convention?
I'm assuming the dropout argument is the fraction of inputs that will be zeroed out coming into the recurrent layer. If that's the case, what's the difference between my example and something like this:
keras.layers.Dropout(0.2)
keras.layers.GRU(32, recurrent_dropout=0.2)
Thank you for all of your help.
Best Answer
Your two snippets are roughly equivalent. The dropout argument is mostly syntactic sugar for putting a Dropout layer in front of the GRU.
By the way, be careful with recurrent dropout. It usually makes things worse.
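One caveat on the comparison, as far as I understand the Keras behaviour: the dropout argument of a recurrent layer reuses one mask at every time step (the "randomized the same way at each time step" idea from the question), while a standalone Dropout layer by default draws an independent mask for each time step. Below is a minimal pure-Python sketch of that difference. It does not call Keras at all; the helper names make_mask, shared_mask_dropout and per_step_dropout are made up for illustration, and the rate of 0.5 is chosen only to make the pattern easy to see.

```python
import random


def make_mask(n_features, rate, rng):
    """Inverted-dropout mask: 0.0 with probability `rate`, else 1 / (1 - rate)."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else keep_scale for _ in range(n_features)]


def shared_mask_dropout(sequence, rate, rng):
    """Draw one mask and reuse it at every time step.

    Assumption: this mirrors what the `dropout` argument of a recurrent
    layer does to its inputs.
    """
    mask = make_mask(len(sequence[0]), rate, rng)
    return [[x * m for x, m in zip(step, mask)] for step in sequence]


def per_step_dropout(sequence, rate, rng):
    """Draw a fresh mask at every time step.

    Assumption: this mirrors a standalone Dropout layer with default settings.
    """
    return [
        [x * m for x, m in zip(step, make_mask(len(step), rate, rng))]
        for step in sequence
    ]


rng = random.Random(0)  # fixed seed so repeated runs give the same masks
sequence = [[1.0] * 8 for _ in range(5)]  # 5 time steps, 8 features, all ones

shared = shared_mask_dropout(sequence, 0.5, rng)
fresh = per_step_dropout(sequence, 0.5, rng)

print("shared mask (same zero positions in every row):")
for step in shared:
    print(step)

print("fresh mask per step (zero positions drawn independently per row):")
for step in fresh:
    print(step)
```

If you want the standalone layer to share its mask across time steps as well, keras.layers.Dropout accepts a noise_shape argument; for inputs of shape (batch_size, timesteps, features), a noise_shape of (batch_size, 1, features) gives the same mask for all time steps.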