Multiple Regression – Understanding Range Restriction in Predicted Values

Tags: multiple-regression, predictive-models, psychometrics, regression

I have built a linear regression model that uses scores on a personality survey to predict manager ratings of job performance. Both my predictor and outcome variables have a min and max score of 0 and 100, respectively. The linear regression model performs well (i.e. high adjusted R-squared, the predictors and the overall model are statistically significant, etc.).

When I use the regression coefficients to estimate new scores, the predicted scores usually fall between about 40 and 60.

An example model:

50.9730 + (Conscientiousness * 0.1790) + (Stability * -0.1149)

Given that the model was developed using predictors and an outcome variable that have a min/max of 0 and 100, why do predicted values have such range restriction?

I was expecting the predicted values to also have a similar min/max of 0 and 100.

Best Answer

This could be explained by a couple of things. First, note that if all predictors had a coefficient of 0, the model would return the mean of the outcome as its prediction. So, if the coefficients are very small, or there is only a weak relationship between the predictors and the outcome (which is not the same thing as the relationship being statistically significant), the model might not predict a very wide range of outcomes.
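To make the first point concrete, here is a minimal sketch on simulated data (not your data; the sample size, slope, and noise level are made-up assumptions). For ordinary least squares with an intercept, the fitted values average to the outcome mean and their standard deviation is sqrt(R^2) times the standard deviation of the outcome, so the predictions are necessarily more compressed than the observed ratings.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data: one predictor and an outcome, both on a 0-100 scale.
x = rng.uniform(0, 100, n)
y = np.clip(50 + 0.15 * (x - 50) + rng.normal(0, 15, n), 0, 100)

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

# In-sample R-squared.
r_squared = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print("observed range: ", y.min(), y.max())
print("predicted range:", y_hat.min(), y_hat.max())
print("SD ratio:", y_hat.std() / y.std(), "sqrt(R^2):", np.sqrt(r_squared))
```

The two numbers on the last line agree, and the predicted range sits well inside the observed one even though both variables span the full scale.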

Second, correlation between the predictors might result in some effects canceling one another. If Conscientiousness and Stability are positively correlated, then seeing Conscientiousness go from 0 to 100 is not going to change the outcome by a full 18 units (0.1790 × 100), because Stability will tend to increase as well and, with its negative coefficient, curb the change in the outcome.
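As a rough check, you can plug the extremes into the example model from the question. The sketch below uses only the three coefficients you posted; the `predict` helper is just an illustrative name.

```python
# Coefficients copied from the example model in the question.
intercept = 50.9730
b_conscientiousness = 0.1790
b_stability = -0.1149

def predict(conscientiousness, stability):
    """Predicted manager rating from the example model."""
    return (intercept
            + b_conscientiousness * conscientiousness
            + b_stability * stability)

# Most extreme predictions the model can produce on a 0-100 scale.
lowest = predict(0, 100)     # low Conscientiousness, high Stability
highest = predict(100, 0)    # high Conscientiousness, low Stability

# If the two traits move together, their effects partly cancel.
both_low = predict(0, 0)
both_high = predict(100, 100)

print(round(lowest, 3), round(highest, 3))
print(round(both_high - both_low, 3))
```

Even with the most favorable and least favorable combinations of scores, this model cannot predict below roughly 39.5 or above roughly 68.9. When both traits rise together from 0 to 100, the prediction moves by only about 6.4 points rather than 17.9.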

You also have to take into account the prevalence of managers rated 100 in all categories, etc. However, this is all speculative unless we see some data. Can you share it with us?
