Solved – Understanding the effect of hyperparameters in machine learning experiments

data mining, hyperparameter, machine learning, optimization

In machine learning, every algorithm has a set of hyperparameters that need to be tuned for best prediction performance. The simplest method for this optimization is grid search, which tries every possible combination of parameter values and picks the one that performs best.
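As a concrete illustration, here is a minimal grid-search sketch using scikit-learn's GridSearchCV; the SVC estimator and the parameter grid are illustrative choices, not part of the question.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Illustrative grid: every combination of C and gamma is evaluated
# with cross-validation (3 x 2 = 6 configurations here).
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1]}

search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```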

However, grid search gives no insight into how parameter values interact. For example, suppose we have five hyperparameters p1, p2, p3, p4, and p5. There may be effects such as: when p3 is low, prediction performance improves as p1 increases, but when p3 is high, p1 has no effect at all. Many more interesting interactions like this may exist. Is there any method for discovering them?

Best Answer

Many dedicated optimization methods exist for hyperparameter tuning. Sequential model-based optimization (a Bayesian-inspired method) is a particularly active research topic. Metaheuristic approaches such as genetic algorithms, particle swarm optimization, and simulated annealing are also common.

If you want to model the effect of hyperparameters, random search is a good sampling strategy to start from.
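One hypothetical way to act on this, sketched below with scikit-learn: sample random configurations with RandomizedSearchCV, then fit a surrogate model (a random forest here, an illustrative choice) on the resulting (configuration, score) pairs. Inspecting the surrogate, e.g. via feature importances or partial dependence, gives a rough picture of each parameter's effect and their interactions.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Sample random (C, gamma) configurations instead of a fixed grid.
param_dist = {"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-3, 1e1)}
search = RandomizedSearchCV(SVC(), param_dist, n_iter=50, cv=5, random_state=0)
search.fit(X, y)

# Fit a surrogate model on (configuration, score) pairs; its structure
# (feature importances, partial dependence) hints at parameter effects.
configs = np.array([[p["C"], p["gamma"]] for p in search.cv_results_["params"]])
scores = search.cv_results_["mean_test_score"]
surrogate = RandomForestRegressor(random_state=0).fit(np.log(configs), scores)
print(dict(zip(["log C", "log gamma"], surrogate.feature_importances_)))
```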

You can find implementations of these optimization methods in tuning libraries such as Optunity, HyperOpt, and Spearmint.
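For example, here is a minimal sketch with HyperOpt's TPE optimizer, an instance of sequential model-based optimization; the quadratic objective is a stand-in for a real training-and-validation run.

```python
from hyperopt import STATUS_OK, fmin, hp, tpe

# Illustrative objective: in practice this would train a model with the
# given hyperparameters and return a validation loss.
def objective(params):
    loss = (params["C"] - 1.0) ** 2 + (params["gamma"] - 0.1) ** 2
    return {"loss": loss, "status": STATUS_OK}

space = {
    "C": hp.loguniform("C", -5, 5),        # samples e^-5 .. e^5
    "gamma": hp.loguniform("gamma", -7, 2),
}

# TPE builds a probabilistic model of past trials and uses it to propose
# the next configuration to evaluate.
best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=100)
print(best)
```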