Solved – Tips for training dropout neural network

Tags: deep learning, dropout, neural networks, train

I am using a NN for my mini research project, and I found that a recent trick for feed-forward NNs is to use dropout for regularization instead of the L1/L2 norm, together with the rectified linear unit (ReLU) as the activation function.

But when I tried it, I always got worse results compared to a standard NN with a sigmoid or hyperbolic tangent activation function.

Is there some rule of thumb or trick we can use for training a dropout ReLU NN?

Best Answer

I am posting quite late, but I wanted to provide an answer in case someone else runs into this problem.

Check that you are turning dropout off when you evaluate on the validation/test set, and also when you compute the error on the training set. Dropout was designed with the express intent of reducing overfitting by injecting noise during training, so if you evaluate the training loss with dropout still turned on, you may see a higher training error than the network actually achieves.
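To make the train/eval distinction concrete, here is a minimal NumPy sketch of inverted dropout (the scaling variant most modern frameworks implement); the function name and `train` flag are illustrative, not from any particular library:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(x, p_drop=0.5, train=True):
    """Inverted dropout: stochastic during training, identity at evaluation."""
    if not train:
        # Evaluation/validation pass: dropout disabled, deterministic output.
        return x
    # Training pass: zero each unit with probability p_drop and rescale the
    # survivors so the expected activation matches the evaluation pass.
    mask = rng.random(x.shape) >= p_drop
    return x * mask / (1.0 - p_drop)

x = np.ones(8)
print(dropout_forward(x, train=True))   # stochastic: some zeros, survivors rescaled
print(dropout_forward(x, train=False))  # deterministic: input passed through unchanged
```

Whenever you measure error, on the training set or the validation/test set, it is the deterministic (`train=False`) branch that should run.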

For those familiar with the Lasagne framework built on top of Theano, the relevant option is lasagne.layers.get_output(net, deterministic=True), which performs a deterministic forward pass, turning off dropout and not performing any other sort of noise injection.
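As a sketch of that pattern (a toy Lasagne model; the layer sizes and variable names are illustrative):

```python
import lasagne
from lasagne.layers import InputLayer, DropoutLayer, DenseLayer

# Toy model: input -> dropout -> softmax output.
l_in = InputLayer(shape=(None, 100))
l_drop = DropoutLayer(l_in, p=0.5)
net = DenseLayer(l_drop, num_units=10,
                 nonlinearity=lasagne.nonlinearities.softmax)

# Training expression: dropout is active (stochastic forward pass).
train_out = lasagne.layers.get_output(net)
# Validation/test expression: deterministic=True disables dropout and any
# other noise layers, so use this one when computing accuracy or loss.
eval_out = lasagne.layers.get_output(net, deterministic=True)
```

Most other frameworks expose the same switch under a different name (e.g., a training-mode flag on the model), so the same check applies regardless of library.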