MATLAB: Understanding and applying results of bayesopt

bayesian optimization, bayesopt, hyperparameter optimization, machine learning, predictive accuracy, Statistics and Machine Learning Toolbox

Hi,

I am having some difficulty understanding the MATLAB documentation for the bayesopt function.

For example, the bestPoint function offers a couple of "best points" of a Bayesian optimization result. Which one should be used in order to get the best out-of-sample predictive accuracy?

Let's say I let Bayesian optimization find the "best" hyperparameters for a regression tree ensemble (by calling fitrensemble with hyperparameter optimization rather than calling bayesopt directly) and obtain the following result graphs:

What, if anything, do the two graphs tell me about the "best point", convergence, predictive accuracy, etc. (in general, but especially for this example)? Are there any sources that explain these concepts, at least at a high level, so that I can make better use of bayesopt?

Best Answer

It looks like no new minima are being found, and that the model of the objective function is stabilizing, but it's not a good model. The model has minima that are negative. A negative value for log(1+Loss) implies that Loss<0, which is impossible for MSE loss.

I've seen this happen when there is a steep "cliff" in the objective function (over hyperparameter space). The Gaussian Process model of that function smooths out the cliff and thereby undershoots the true function (and zero) at the base of the cliff. In fact, the reason the objective function for optimizing regression fit functions is defined as log(1+Loss) instead of Loss is to reduce the size of such cliffs, and with it the chance of undershoots like this.

To diagnose this, you could look at the values of the objective function that are being found, to see if they differ by orders of magnitude.
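A rough sketch of that check follows. It assumes the default 'bayesopt' optimizer, in which case the fitted model's HyperparameterOptimizationResults property holds a BayesianOptimization object. The data is synthetic and the run is kept short purely for illustration.

```matlab
% Synthetic regression data (hypothetical, for illustration only)
rng(0);
X = rand(200, 3);
Y = 10*X(:,1) + sin(6*X(:,2)) + 0.1*randn(200, 1);

% Short optimization run; plots and console output are suppressed
opts = struct('MaxObjectiveEvaluations', 10, 'ShowPlots', false, 'Verbose', 0);
Mdl = fitrensemble(X, Y, 'OptimizeHyperparameters', 'auto', ...
    'HyperparameterOptimizationOptions', opts);

results = Mdl.HyperparameterOptimizationResults;  % BayesianOptimization object
obj  = results.ObjectiveTrace;  % observed log(1 + Loss), one value per evaluation
loss = expm1(obj);              % undo the transform: Loss = exp(obj) - 1

fprintf('Observed loss range: %.3g to %.3g\n', min(loss), max(loss));
fprintf('Orders of magnitude spanned: %.2f\n', log10(max(loss) / min(loss)));
fprintf('Minimum observed objective:  %.3g\n', results.MinObjective);
fprintf('Minimum estimated objective: %.3g\n', results.MinEstimatedObjective);
```

If the observed losses span several orders of magnitude, or the estimated minimum is negative while every observed value is positive, the model of the objective is probably being distorted in the way described above.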

Regarding bestPoint, since the model is not giving a reasonable estimate of the minimum of the objective function, it would probably be better to trust the minimum observed point and use the 'min-observed' criterion.
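As a minimal sketch, this is how the 'min-observed' criterion can be requested and compared with the default one. The one-dimensional search space and the toy objective are hypothetical stand-ins so that the example runs quickly; with fitrensemble, results would instead come from Mdl.HyperparameterOptimizationResults.

```matlab
rng(0);
x   = optimizableVariable('x', [0 1]);   % hypothetical 1-D search space
fun = @(t) log1p((t.x - 0.3).^2);        % hypothetical toy objective

results = bayesopt(fun, x, 'MaxObjectiveEvaluations', 15, ...
    'Verbose', 0, 'PlotFcn', []);

% Best point by the lowest objective value that was actually observed
[xObs, valObs, iterObs] = bestPoint(results, 'Criterion', 'min-observed');

% Best point by the default criterion, which relies on the fitted model
[xDef, valDef] = bestPoint(results);

fprintf('min-observed: x = %.4f, objective = %.4g (iteration %d)\n', ...
    xObs.x, valObs, iterObs);
fprintf('default:      x = %.4f, criterion value = %.4g\n', xDef.x, valDef);
```

When the two points disagree strongly and the model looks unreliable, the observed one is the safer choice, as suggested above.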

Related Solutions

MATLAB: Finding optimal regression tree using hyperparameter optimization

My guess is that your first run was worse because it was not run for enough iterations. The default MaxObjectiveEvaluations is 30, but since your first optimization searches a larger space (including a categorical variable), you should probably multiply that a few times. You're also using 'Repartition',true, which calls for more iterations. Try running it for at least 100 iterations; the more the better, as time permits. You can pass MaxObjectiveEvaluations inside HyperparameterOptimizationOptions.
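For instance, the options can be passed as a struct. This is only a sketch on synthetic data: fitrtree with the 'auto' setting is an assumption made to keep the run short, and the same struct should work with fitrensemble.

```matlab
% Synthetic regression data (hypothetical, for illustration only)
rng(0);
X = rand(200, 3);
Y = 10*X(:,1) + sin(6*X(:,2)) + 0.1*randn(200, 1);

opts = struct( ...
    'MaxObjectiveEvaluations', 100, ... % raised from the default of 30
    'Repartition', true, ...            % new cross-validation split per evaluation
    'ShowPlots', false, ...
    'Verbose', 0);

Mdl = fitrtree(X, Y, 'OptimizeHyperparameters', 'auto', ...
    'HyperparameterOptimizationOptions', opts);

results = Mdl.HyperparameterOptimizationResults;
fprintf('Evaluations run: %d\n', results.NumObjectiveEvaluations);
```

Lower the evaluation count for a quick trial and raise it again for the real run.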

The objective being minimized for regression is log(1 + MSE), computed on the validation set. By default that's 5-fold cross-validation. This is mentioned near the bottom of the OptimizeHyperparameters section on this doc page: http://www.mathworks.com/help/stats/fitrensemble.html#input_argument_d0e360201 Your final calls to kfoldLoss will return MSE, which will differ from the objective function values.
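To put the two numbers on the same scale, apply the same transform to the kfoldLoss result, or invert it on the objective value. A small sketch on synthetic data, using a plain cross-validated tree as a stand-in for your model:

```matlab
% Synthetic regression data (hypothetical, for illustration only)
rng(0);
X = rand(200, 3);
Y = 10*X(:,1) + sin(6*X(:,2)) + 0.1*randn(200, 1);

cvMdl = fitrtree(X, Y, 'KFold', 5);  % 5-fold cross-validated regression tree
mse   = kfoldLoss(cvMdl);            % mean squared error (the default loss)

objective = log(1 + mse);            % scale used by the optimizer
mseBack   = expm1(objective);        % back to MSE: exp(objective) - 1

fprintf('MSE = %.4g, log(1 + MSE) = %.4g\n', mse, objective);
```

Even on the same scale the values will rarely match exactly, because the cross-validation partitions usually differ between the optimization and a later call.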

In any case, you should use the model that has the lowest cross-validated MSE no matter how you found it.

MATLAB: How to retrieve optimal MinLeafSize after automatic hyperparameter optimization for Tree Ensemble (fitrensemble)

I ran both a tree model and an ensemble model, optimizing MinLeafSize. For a decision tree, MinLeaf is a model parameter, but not for an ensemble. The only way I could find to see the value was by viewing the learner template.

Mdl.ModelParameters.LearnerTemplates{1,1}
ans = 
Fit template for regression Tree.
        SplitCriterion: []
             MinParent: []
               MinLeaf: 126
             MaxSplits: 10
          NVarToSample: []
           MergeLeaves: 'off'
                 Prune: 'off'
        PruneCriterion: []
               QEToler: []
            NSurrogate: []
                MaxCat: []
                AlgCat: []
    PredictorSelection: []
          UseChisqTest: []
                Stream: []
          Reproducible: []
               Version: 2
                Method: 'Tree'
                  Type: 'regression'
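Another place to look may be the optimization results stored on the model. This is a sketch under the assumption that the default 'bayesopt' optimizer was used, so that HyperparameterOptimizationResults is a BayesianOptimization object. The data is synthetic, and which of the points below the final model was actually trained with is best confirmed against the template shown above.

```matlab
% Synthetic regression data (hypothetical, for illustration only)
rng(0);
X = rand(200, 3);
Y = 10*X(:,1) + sin(6*X(:,2)) + 0.1*randn(200, 1);

opts = struct('MaxObjectiveEvaluations', 10, 'ShowPlots', false, 'Verbose', 0);
Mdl = fitrensemble(X, Y, 'OptimizeHyperparameters', {'MinLeafSize'}, ...
    'HyperparameterOptimizationOptions', opts);

results = Mdl.HyperparameterOptimizationResults;

best = bestPoint(results);               % table row, default criterion
observedBest = results.XAtMinObjective;  % table row, lowest observed objective

fprintf('MinLeafSize (bestPoint default): %d\n', best.MinLeafSize);
fprintf('MinLeafSize (min observed):      %d\n', observedBest.MinLeafSize);
```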
