Solved – Confidence interval for boosted decision trees

confidence interval, quantile regression, r

I have been using Azure studio for predictions and ended up using Boosted Decision Trees, as it gave me the best results. However, I need to attach some measure of confidence to my predictions on out-of-sample data, and so far I have not found a method for doing that. One option was to use a different model ("Forest quantile regression"), which gives me percentiles, but if I ask for the 95% and 4% percentiles, the resulting range is too wide to be acceptable. Are there any alternatives?

Best Answer

Hold out a test set and check the performance of the model on the test set.

Spoiler alert: RF often has significantly worse performance on the test set than on the training data. You can mitigate this by limiting the tree complexity. It is best to use a validation set to choose parameters such as leaf size and number of splits, then use a test set to report the final performance.
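As a rough illustration of the train/validation/test workflow described above, here is a minimal sketch in Python with scikit-learn. It is not what Azure studio does internally. The synthetic data, the 60/20/20 split, the parameter grid, and the choice of `GradientBoostingRegressor` as a stand-in for a boosted decision tree are all assumptions made for the example. `min_samples_leaf` plays the role of leaf size, and `max_depth` is used as a proxy for the number of splits per tree.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the real dataset (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(600, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=600)

# Split into train (360), validation (120), and test (120) rows.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=240, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=120, random_state=0
)

# Hypothetical grid over tree-complexity parameters.
grid = [(leaf, depth) for leaf in (1, 10, 30) for depth in (2, 4)]


def fit_model(leaf, depth):
    """Fit a boosted tree model on the training set only."""
    model = GradientBoostingRegressor(
        min_samples_leaf=leaf, max_depth=depth,
        n_estimators=100, random_state=0,
    )
    return model.fit(X_train, y_train)


# Choose the parameters using the validation set, never the test set.
val_errors = {
    (leaf, depth): mean_absolute_error(y_val, fit_model(leaf, depth).predict(X_val))
    for leaf, depth in grid
}
best_leaf, best_depth = min(val_errors, key=val_errors.get)

# Report final performance once, on the untouched test set.
final_model = fit_model(best_leaf, best_depth)
train_mae = mean_absolute_error(y_train, final_model.predict(X_train))
test_mae = mean_absolute_error(y_test, final_model.predict(X_test))

print(f"chosen min_samples_leaf={best_leaf}, max_depth={best_depth}")
print(f"train MAE={train_mae:.3f}, test MAE={test_mae:.3f}")
```

The gap between the training and test error is the thing to look at: a large gap suggests the trees are too complex for the amount of data, which is the situation the answer warns about.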
