How to Calculate Variable Weights in Multiple Linear Regression

Once a regression model has been fitted, how can one assess the relative contribution of each variable?

Let's say I'm working on the energy consumption of a facility. For example:

$Consumption = \beta_0+\beta_1\text{Tons_product1}+\beta_2\text{Tons_product2}+\beta_3\text{weather_conditions}$

I want to compute the relative contribution of each factor to the energy consumption. I also want it for $\beta_0$ (which, as I've learnt here, does not strictly represent the consumption with no production and no degree-days, but is still meaningful, I guess).

By "relative contribution of each variable" I mean which variable accounts for the largest share of the consumption. The aim is to answer this question: "What factor do I need to work on first to reduce my consumption, assuming I can achieve a 10% improvement in each factor?"

I can run the regression with all variables standardized and take $\beta_1'$ and $\beta_2'$ as relative weights, but this makes $\beta_0$ disappear, so I can't assess its importance.

I've read here that the t-statistic of each parameter is a way to assess this contribution. Is that true? If so, can I use $t_i / \sum_j t_j$ to compute the relative contribution of each $\beta_i$?

Or am I totally wrong? As you can see, I'm no expert in stats, and I do the calculations in Excel using the built-in plugin.

Best Answer

Let's use the formulation of your goal from one of your comments: "What factor do I need to work on first to reduce my consumption, assuming I can achieve a 10% improvement in each factor?" (Note that this goal is different from what you would get by analyzing the standardized model, which scales factor values by the variability of each factor among your observations. Clearly stating the goal is sometimes the hardest part of this type of analysis.)
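To make the difference concrete, here is a sketch of how the standardized coefficients relate to the raw ones, assuming both the factors and Consumption are centered and divided by their sample standard deviations (the symbols $s_{x_j}$ and $s_y$ are introduced here only for illustration):

$$\beta_j' = \beta_j \, \frac{s_{x_j}}{s_y}, \qquad \beta_0' = 0$$

So a standardized coefficient measures the effect of a one-standard-deviation change in a factor, in standard-deviation units of Consumption, not the effect of a 10% change. The intercept vanishes because a least-squares fit on centered variables passes through the origin.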

The way to reach that goal comes straight from the regression coefficients of the non-standardized model, in which all factors and Consumption are expressed in their own natural units.

For each of your factors, find the actual magnitude of a 10% improvement and multiply that amount by the corresponding regression coefficient. That will tell you how much Consumption is predicted to be affected by a 10% improvement in that factor, based on your regression.
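The calculation above can be sketched in Python rather than Excel. This is only an illustration under stated assumptions: the data are made up, `weather_conditions` is taken to be degree-days, a "10% improvement" is taken to mean lowering each factor's average level by 10%, and the helper `effect_of_improvement` is a hypothetical name.

```python
import numpy as np

def effect_of_improvement(coefs, typical_levels, fraction=0.10):
    """Predicted change in the response when each factor is lowered
    by `fraction` of its typical level (assumption: lower is better)."""
    coefs = np.asarray(coefs, dtype=float)
    typical_levels = np.asarray(typical_levels, dtype=float)
    delta = -fraction * typical_levels   # size of the improvement, in each factor's own units
    return coefs * delta                 # predicted change in Consumption, per factor

# Hypothetical observations (not real facility data)
rng = np.random.default_rng(0)
n = 200
tons1 = rng.uniform(50, 150, n)
tons2 = rng.uniform(10, 40, n)
degree_days = rng.uniform(0, 20, n)
consumption = (500 + 3.0 * tons1 + 8.0 * tons2 + 12.0 * degree_days
               + rng.normal(0, 20, n))

# Non-standardized model: the intercept column plus factors in natural units
X = np.column_stack([np.ones(n), tons1, tons2, degree_days])
beta, *_ = np.linalg.lstsq(X, consumption, rcond=None)

# "Typical level" is assumed to be the sample mean of each factor
typical = X[:, 1:].mean(axis=0)
effects = effect_of_improvement(beta[1:], typical)

names = ["Tons_product1", "Tons_product2", "weather_conditions"]
ranking = sorted(zip(names, effects), key=lambda pair: pair[1])
for name, eff in ranking:
    print(f"{name}: predicted change in Consumption = {eff:+.1f}")
```

Under these assumptions, the factor listed first (the most negative predicted change) is the one to work on first. The intercept $\beta_0$ does not enter this ranking, because it is not multiplied by any factor you can change.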

Note, however, that these predictions depend on the quality of your regression model. If there are large standard errors in the regression coefficients, the predicted effects on Consumption will not be very precise. The simple model you present doesn't include interaction terms among the factors, but those might be very important in a realistic model. And if the true relations of factors to Consumption are nonlinear, your predictions based on the linear model might not be very good.
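To see how coefficient uncertainty carries over to the predicted effects, one rough sketch is to scale each coefficient's standard error by the size of the improvement. The assumptions here are the usual ordinary least squares ones, the size of the 10% change is treated as a fixed number, the data are again made up, and `ols_with_se` is a hypothetical helper; the "plus or minus two standard errors" band is only an approximate 95% interval.

```python
import numpy as np

def ols_with_se(X, y):
    """Ordinary least squares fit; returns coefficients and their standard errors."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)        # unbiased estimate of the residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance matrix of the coefficient estimates
    return beta, np.sqrt(np.diag(cov))

# Hypothetical observations (not real facility data)
rng = np.random.default_rng(1)
n = 60
tons1 = rng.uniform(50, 150, n)
tons2 = rng.uniform(10, 40, n)
consumption = 500 + 3.0 * tons1 + 8.0 * tons2 + rng.normal(0, 40, n)

X = np.column_stack([np.ones(n), tons1, tons2])
beta, se = ols_with_se(X, consumption)

# A 10% reduction of each factor's mean level, treated as a fixed quantity
delta = -0.10 * X[:, 1:].mean(axis=0)
effect = beta[1:] * delta
effect_se = np.abs(delta) * se[1:]          # uncertainty coming from the coefficients only

lower = effect - 2 * effect_se              # rough interval, about 95% under normal errors
upper = effect + 2 * effect_se
for name, e, lo, hi in zip(["Tons_product1", "Tons_product2"], effect, lower, upper):
    print(f"{name}: {e:+.1f} (roughly {lo:+.1f} to {hi:+.1f})")
```

If the intervals for two factors overlap heavily, the regression does not really tell you which of them to work on first.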

If you are going to continue doing these types of analyses, take some good courses on probability and statistics. That will help you formulate your problems much more precisely and quickly.