I am calculating propensity scores using fitrensemble. I am interested in finding the ensemble with the lowest test RMSE, since I will use the resulting model to predict outcomes in a very large second dataset. I am currently using hyperparameter optimization to find the optimal ensemble with the code below:
% Optimize for model
rng default
propensity_final = fitrensemble(X,Y, ...
    'Learner',templateTree('Surrogate','on'), ...
    'Weights',W, ...
    'OptimizeHyperparameters',{'Method','NumLearningCycles','MaxNumSplits','LearnRate'}, ...
    'HyperparameterOptimizationOptions',struct('Repartition',true, ...
        'AcquisitionFunctionName','expected-improvement-plus'));
loss_final = kfoldLoss(crossval(propensity_final,'kfold',10));
However, I find that when I do not optimize over 'Method' and instead fit each method separately, as in either of the examples below, the cross-validation error is lower.
% Bagged
propensity1_bag = fitrensemble(X,Y, ...
    'Method','Bag', ...
    'Learner',templateTree('Surrogate','on'), ...
    'Weights',W, ...
    'OptimizeHyperparameters',{'NumLearningCycles','MaxNumSplits'}, ...
    'HyperparameterOptimizationOptions',struct('Repartition',true, ...
        'AcquisitionFunctionName','expected-improvement-plus'));
loss1_bag = kfoldLoss(crossval(propensity1_bag,'kfold',10));
% LSBoost
propensity1_boost = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learner',templateTree('Surrogate','on'), ...
    'Weights',W, ...
    'OptimizeHyperparameters',{'NumLearningCycles','MaxNumSplits','LearnRate'}, ...
    'HyperparameterOptimizationOptions',struct('Repartition',true, ...
        'AcquisitionFunctionName','expected-improvement-plus'));
loss1_boost = kfoldLoss(crossval(propensity1_boost,'kfold',10)); % note: was propensity1_bag, a copy-paste error
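One thing I am aware of: each crossval call above repartitions the data at random, so the three losses are estimated on different folds and may not be directly comparable. A sketch of how one might pin all three estimates to a single partition (the variable name c is mine; crossval accepts a 'CVPartition' name-value argument):

% Fix one 10-fold partition so all three loss estimates use the same folds
c = cvpartition(numel(Y),'KFold',10);
loss_final  = kfoldLoss(crossval(propensity_final, 'CVPartition',c));
loss1_bag   = kfoldLoss(crossval(propensity1_bag,  'CVPartition',c));
loss1_boost = kfoldLoss(crossval(propensity1_boost,'CVPartition',c));

Even with a shared partition, though, the ranking of the models comes out the same for me.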
What is the objective (the "best so far" and "estimated" values reported during optimization) that the function tries to minimize? Why are loss1_boost and loss1_bag lower than loss_final? And how do I know which model to use?
Thank you!
Best Answer