Solved – Hyper-parameter optimization via random search

hyperparameter-optimization

I’m working on a classification system that consists of an auto-encoder for feature learning and logistic regression for classification. The system has five hyper-parameters, listed below.

  1. Number of features learned by the auto-encoder
  2. Weight-decay parameter of the auto-encoder
  3. Weight-decay parameter of the logistic regression
  4. Sparsity parameter of the auto-encoder
  5. Weight of the sparsity penalty term of the auto-encoder

I’m planning to use random search to obtain good values for these parameters. More information about random search for hyper-parameter optimization can be found in this paper.

My question is: in order to perform a random search, we need to identify an appropriate range for each hyper-parameter. For example, the weight-decay parameter of the auto-encoder should belong to some interval [X, Y].

Do you know of a published paper from which I could extract these initial ranges?

Best Answer

For some of these parameters you can simply reason about what the range should be. For example, since the sparsity parameter is the desired average activation of the hidden units, it only makes sense in $(0, 1)$, assuming you're using a sigmoid activation. You can narrow this down further, as a sparsity $> 0.5$ is somewhat nonsensical. For the other parameters you can look at the values people commonly report in the literature, use those as a rough guess, and expand your ranges if you find it necessary. For example, I would probably start with $(0, 0.2)$ for the learning rate.
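To make this concrete, here is a minimal sketch of random search over your five hyper-parameters. The ranges are illustrative assumptions in the spirit of the reasoning above (sparsity restricted to below 0.5, weight decays sampled log-uniformly, a made-up feature-count range), not values taken from any paper, and the objective function is a toy stand-in for your cross-validated score:

```python
import math
import random

random.seed(0)

def sample_config():
    """Draw one hyper-parameter configuration at random.

    All ranges below are rough, hypothetical guesses to be widened
    or narrowed as you learn more about your problem.
    """
    return {
        # number of hidden features learned by the auto-encoder
        "n_features": random.randint(50, 500),
        # weight decays: log-uniform sampling is common, since
        # plausible values span several orders of magnitude
        "ae_weight_decay": 10 ** random.uniform(-6, -1),
        "lr_weight_decay": 10 ** random.uniform(-6, -1),
        # sparsity target: average activation, kept in (0, 0.5)
        "sparsity": random.uniform(0.01, 0.5),
        # weight of the sparsity penalty term
        "sparsity_penalty": 10 ** random.uniform(-2, 1),
    }

def random_search(evaluate, n_trials=20):
    """Return the best configuration found over n_trials random draws."""
    best_cfg, best_score = None, -math.inf
    for _ in range(n_trials):
        cfg = sample_config()
        score = evaluate(cfg)  # e.g. cross-validated accuracy
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy objective for illustration only: prefers sparsity near 0.05.
# In practice, train the auto-encoder + logistic regression with cfg
# and return the validation score.
best, score = random_search(lambda cfg: -abs(cfg["sparsity"] - 0.05))
```

The point is that each trial is independent, so you can widen a range later and simply keep sampling; no previous trials are invalidated.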

Another useful place to look is Geoff Hinton's practical guide to training restricted Boltzmann machines. I imagine most of that advice is applicable to auto-encoders as well.