Random Forest for continuous response variable

random forest

I am running randomForest on a dataset which has a continuous dependent variable. Is there a way to get coefficients of the predictors as in linear regression?

Best Answer

No. The coefficients in a linear regression model are intimately tied to the specification of the model structure. That is, you postulate a relationship of the form:

$$ E(y \mid x) = \beta \cdot x $$

and find the coefficients $\beta$ that best fit this postulated relationship to your observed data.
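To make this concrete, here is a minimal sketch of a least-squares fit with a single predictor and an intercept. The function name `fit_simple_linear` and the data are made up for illustration; the point is only that the whole fitted model is captured by two numbers.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: forces the fitted line through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Made-up data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

b0, b1 = fit_simple_linear(xs, ys)
print(b0, b1)  # 1.0 2.0
```

Once `b0` and `b1` are known, the training data can be thrown away: the two coefficients are the model.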

In the absence of such a structured postulated form of the $y$ to $x$ relationship, no terse yet complete summary of the model is possible.

Generally, models that can be completely summarized with a finite, small vector in $\mathbb{R}^n$ are referred to as parametric, and those that cannot are dubbed non-parametric. Random forest is a non-parametric model.

The distinction between parametric and non-parametric is loose. With a sufficiently clever encoding scheme you could, theoretically at least, express a random forest as a single vector, but the number of components would run into the tens of thousands for a reasonably sized forest, far too many to be meaningful to any human interpreter.
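As a rough illustration of such an encoding, the sketch below flattens a toy decision tree into a list of numbers and then counts how long the list would be for a whole forest. The encoding (two numbers per node, with `-1` marking a leaf), the toy tree, and the forest size of 500 trees with 50 leaves each are all assumptions chosen for illustration; this is not how any particular random forest library stores its trees.

```python
def flatten_tree(node):
    """Encode a tree in preorder as a flat list of numbers.

    Hypothetical encoding: an internal node is a tuple
    (feature_index, threshold, left, right) and becomes [feature_index, threshold];
    a leaf is a bare float and becomes [-1, value].
    """
    if isinstance(node, tuple):
        feature, threshold, left, right = node
        return [feature, threshold] + flatten_tree(left) + flatten_tree(right)
    return [-1, node]


def forest_vector_length(n_trees, leaves_per_tree):
    """Length of the flat encoding for a forest of binary trees."""
    # A binary tree with L leaves has L - 1 internal nodes, so 2L - 1 nodes.
    nodes_per_tree = 2 * leaves_per_tree - 1
    # Two numbers per node under the encoding above.
    return n_trees * nodes_per_tree * 2


# Toy tree with three leaves: split on feature 0, then on feature 1.
tree = (0, 2.5, 1.0, (1, 0.7, 4.0, 6.0))
vector = flatten_tree(tree)
print(vector)       # [0, 2.5, -1, 1.0, 1, 0.7, -1, 4.0, -1, 6.0]
print(len(vector))  # 10

# Assumed forest size, for illustration only.
print(forest_vector_length(500, 50))  # 99000
```

Even under this compact scheme, a modest forest needs on the order of a hundred thousand numbers, and unlike regression coefficients none of them can be read in isolation: a threshold only means something in the context of the splits above it.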
