What you are referring to as a confidence score can be obtained from the uncertainty in an individual prediction (e.g., by taking its inverse).
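As a simple illustration of that idea, here is a minimal Python sketch using scikit-learn, where the spread of the individual trees' predictions serves as a crude uncertainty estimate and its inverse as a "confidence score". Note this is not the bias-corrected jackknife estimator; it is only meant to show the uncertainty-to-confidence conversion.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Stack each tree's prediction for a few query points: shape (n_trees, n_points)
per_tree = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])

uncertainty = per_tree.std(axis=0)       # spread across trees = crude uncertainty
confidence = 1.0 / (uncertainty + 1e-8)  # inverse of the uncertainty
```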
Quantifying this uncertainty was always possible with bagging and is relatively straightforward in random forests, but the naive estimates were biased. Wager et al. (2014) described two procedures to estimate these uncertainties more efficiently and with less bias, based on bias-corrected versions of the jackknife-after-bootstrap and the infinitesimal jackknife. You can find implementations in the R packages ranger and grf.
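To make the infinitesimal jackknife concrete, here is a hand-rolled Python sketch for a bagged ensemble of regression trees. It computes the estimator from Wager et al. (2014), V_IJ = Σ_i Cov_b(N_bi, t_b(x))², where N_bi counts how often training point i appears in bootstrap sample b and t_b(x) is tree b's prediction, plus the paper's bias correction. In practice you would use the ranger or grf implementations; this loop is only an illustration of the mechanics.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n, B = 200, 500
X = rng.normal(size=(n, 3))
y = X[:, 0] + 0.1 * rng.normal(size=n)
x_test = np.zeros((1, 3))

preds = np.empty(B)        # t_b(x): each tree's prediction at x_test
counts = np.empty((B, n))  # N_bi: times point i appears in bootstrap sample b
for b in range(B):
    idx = rng.integers(0, n, size=n)
    counts[b] = np.bincount(idx, minlength=n)
    tree = DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx])
    preds[b] = tree.predict(x_test)[0]

# Infinitesimal jackknife: V_IJ = sum_i Cov_b(N_bi, t_b)^2
cov = ((counts - counts.mean(axis=0)) * (preds - preds.mean())[:, None]).mean(axis=0)
v_ij = np.sum(cov ** 2)

# Monte Carlo bias correction from Wager et al. (2014)
v_ij_u = v_ij - (n / B**2) * np.sum((preds - preds.mean()) ** 2)
se = np.sqrt(max(v_ij_u, 0.0))  # standard error of the ensemble prediction
```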
More recently, this has been improved upon by building the random forests from conditional inference trees. Based on simulation studies (Brokamp et al. 2017), the infinitesimal jackknife estimator appears to estimate the prediction error more accurately when conditional inference trees are used to build the random forests. This is implemented in the R package RFinfer.
Wager, S., Hastie, T., & Efron, B. (2014). Confidence intervals for random forests: The jackknife and the infinitesimal jackknife. The Journal of Machine Learning Research, 15(1), 1625-1651.
Brokamp, C., Rao, M. B., Ryan, P., & Jandarov, R. (2017). A comparison of resampling and recursive partitioning methods in random forest for estimating the asymptotic variance using the infinitesimal jackknife. Stat, 6(1), 360-372.
Best Answer
Here's an example of a multi-output regression problem, applied to face completion, from the scikit-learn documentation. It includes a code sample as well and should give you a start with your methodology: http://scikit-learn.org/stable/auto_examples/plot_multioutput_face_completion.html
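If you just need the basic pattern rather than the full face-completion example, here is a minimal sketch on synthetic data: scikit-learn's RandomForestRegressor accepts a 2-D target matrix directly, so multi-output regression needs no special handling.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
# Two synthetic targets: a multi-output y is just a (n_samples, n_targets) array
Y = np.column_stack([X[:, 0] + X[:, 1], X[:, 2] - X[:, 3]])

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, Y_tr)
Y_pred = model.predict(X_te)  # shape (n_test, 2): one column per target
```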