Solved – MSE and variance reduction in regression trees

bias, cart, mathematical-statistics, mse, variance

I've seen this statement in the scikit-learn documentation for regression trees: "Supported criteria are “mse” for the mean squared error, which is equal to variance reduction as feature selection criterion."

Considering that MSE = variance + bias², is this claim always true? Is MSE reduction in regression trees always equal to variance reduction? Can we safely say that without considering bias?

UPDATE: The expected squared deviation of an estimator from the true parameter is the MSE, and the expected squared deviation of the estimator from its own expected value is the variance. They are the same when the bias is 0. I'm just trying to see the math behind the claim that MSE reduction equals variance reduction. In other words, what if selecting a feature for the split only decreases the bias (and not the variance)?
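(For reference, the decomposition I'm referring to is the standard one for an estimator $\hat\theta$ of a parameter $\theta$:

$$\mathrm{MSE}(\hat\theta) = \mathbb{E}\big[(\hat\theta - \theta)^2\big] = \operatorname{Var}(\hat\theta) + \operatorname{Bias}(\hat\theta)^2.)$$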

Best Answer

Here, the MSE used as the split criterion is computed within each node, and the node's prediction is the mean of the training targets in that node. Relative to that mean, the bias term is zero by construction, so the node MSE is exactly the within-node variance of the targets. Bias is implicitly handled this way, and trying to separate it out at the level of a single split wouldn't be meaningful.
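As a minimal sketch of the algebra: for a node containing targets $y_1, \dots, y_n$, the tree predicts the node mean $\bar y = \frac{1}{n}\sum_{i=1}^n y_i$, so

$$\text{node MSE} = \frac{1}{n}\sum_{i=1}^n (y_i - \bar y)^2 = \operatorname{Var}(y \mid \text{node}).$$

A split is then chosen to maximize the impurity decrease $\operatorname{Var}(\text{parent}) - \frac{n_L}{n}\operatorname{Var}(\text{left}) - \frac{n_R}{n}\operatorname{Var}(\text{right})$, which is precisely a variance reduction.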

As a side note, random forests (ensembles of such trees) often outperform a single regression tree on prediction tasks.

The goal of a regression tree is not to fit a single line, but to partition the data into regions and predict a constant value, the mean of the training targets, within each region.

The node MSE is the average squared difference between the actual target values and the node's prediction (that node's mean). The tree-growing algorithm searches over candidate splits and selects the one that yields the smallest weighted MSE across the resulting child nodes.
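Here is a minimal sketch of that search for a single numeric feature; `node_mse` and `best_split` are illustrative names, not scikit-learn's API:

```python
import numpy as np

def node_mse(y):
    """Node impurity: mean squared error around the node mean.
    This equals the (biased) sample variance of y, i.e. np.var(y)."""
    return np.mean((y - y.mean()) ** 2)

def best_split(x, y):
    """Exhaustively search thresholds on one feature and return the one
    minimizing the weighted child MSE (equivalently, maximizing the
    variance reduction). A simplified sketch of the CART criterion."""
    n = len(y)
    best_t, best_score = None, np.inf
    for t in np.unique(x)[:-1]:  # all but the largest value, so both children are nonempty
        left, right = y[x <= t], y[x > t]
        score = (len(left) * node_mse(left) + len(right) * node_mse(right)) / n
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Toy data with a jump at x = 0.5: the chosen threshold lands near 0.5.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.where(x < 0.5, 1.0, 3.0) + rng.normal(0, 0.1, 200)
print(best_split(x, y))
```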

Regarding variance: within a node, the variance of the targets is the mean squared deviation from the node mean, and since the node mean is exactly what the tree predicts, the node MSE and the node variance are the same quantity. Minimizing one is minimizing the other, which is why the criterion can equally be called variance reduction.
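A quick numerical check of that identity (the values are just an illustration):

```python
import numpy as np

y = np.array([1.0, 2.0, 4.0, 7.0])
mse_around_mean = np.mean((y - y.mean()) ** 2)  # MSE of predicting the mean
print(mse_around_mean, np.var(y))               # both print 5.25
```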

R-squared is another commonly used measure of how well a regression model fits the data.
