I have two predictors, x1 and x2, and the relationship between x1 and y is quadratic. I therefore transformed x1 by squaring it and then added an interaction term to meet the assumptions of the linear regression model. The final regression is:
$$ y = \beta_0 + \beta_1 x_1 x_2 + \beta_2 x_1^2 + \beta_3 x_2 $$
Below is the scatter plot between x1 and y and the transformation that I have done.
After the transformation and adding an interaction term, the final model looks like this.
Call:
lm(formula = y ~ interaction + x1sq + x2, data = df)

Residuals:
     Min       1Q   Median       3Q      Max
-0.61828 -0.13661  0.00163  0.13741  0.67368

Coefficients:
              Estimate Std. Error  t value Pr(>|t|)
(Intercept)  0.5056567  0.0148510    34.05   <2e-16 ***
interaction -1.0011209  0.0007353 -1361.44   <2e-16 ***
x1sq         1.9977889  0.0011077  1803.59   <2e-16 ***
x2           0.5004741  0.0031027   161.30   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2003 on 996 degrees of freedom
Multiple R-squared: 0.9998, Adjusted R-squared: 0.9998
F-statistic: 1.642e+06 on 3 and 996 DF, p-value: < 2.2e-16
The variance-covariance matrix of the coefficient estimates is:

              (Intercept)   interaction          x1sq            x2
(Intercept)  2.205524e-04  2.786894e-07 -5.829721e-06 -3.937890e-05
interaction  2.786894e-07  5.407276e-07 -3.093296e-08 -4.557951e-08
x1sq        -5.829721e-06 -3.093296e-08  1.226938e-06  2.368341e-07
x2          -3.937890e-05 -4.557951e-08  2.368341e-07  9.626868e-06
I do not wish to abandon the linear regression model and I want to interpret the model hyper-parameters. Is there anything that I can do to achieve this?
Best Answer
What you have is almost exactly:
$$ y = 0.5+ 2 x_1^2 + 0.5 x_2 - x_1 x_2.$$
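To put a number on "almost exactly", here is a minimal R sketch that measures how far each reported estimate lies from the round value in the equation above, in units of its standard error. The estimates and standard errors are copied from the `summary()` output in the question. The vector `round_val` and the other variable names are my own, and the round values are simply read off the equation, not known truths.

```r
# Estimates and standard errors copied from the summary() output in the question.
est <- c("(Intercept)" = 0.5056567, interaction = -1.0011209,
         x1sq = 1.9977889, x2 = 0.5004741)
se  <- c("(Intercept)" = 0.0148510, interaction = 0.0007353,
         x1sq = 0.0011077, x2 = 0.0031027)

# Candidate "round" coefficients: an assumption read off the equation above.
round_val <- c("(Intercept)" = 0.5, interaction = -1, x1sq = 2, x2 = 0.5)

# Distance of each estimate from its round value, in standard-error units.
z <- (est - round_val) / se
print(round(z, 2))
```

Every estimate sits within about two standard errors of its round value, with `x1sq` the furthest away. This is only a coefficient-by-coefficient look. A joint check would also need the covariances between the estimates.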
You need to apply your understanding of the subject matter to interpret the coefficients.* With such simple coefficients and small standard errors relative to the scale of your $y$ values, I suspect that there is some theoretical relationship underlying your model's results.
Try rearranging or combining the terms in the above equation in a way that might make sense for your subject matter. Without knowing more about your subject matter, it's hard to provide more precise advice.
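As one purely algebraic illustration of such a rearrangement, the terms involving $x_2$ can be grouped. This assumes the rounded coefficients from the equation above, and whether this grouping means anything depends on your subject matter:

$$ y = 0.5 + 2 x_1^2 + (0.5 - x_1)\, x_2. $$

Written this way, the slope of $y$ with respect to $x_2$ is $0.5 - x_1$, which changes sign at $x_1 = 0.5$. Because of the interaction, the effect of each predictor depends on the current value of the other, as the partial derivatives show:

$$ \frac{\partial y}{\partial x_1} = 4 x_1 - x_2, \qquad \frac{\partial y}{\partial x_2} = 0.5 - x_1. $$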
*Technically these aren't called "hyperparameters". From Wikipedia: "In machine learning, a hyperparameter is a parameter whose value is used to control the learning process." (Emphasis added.) The coefficient estimates in the model are results of the learning/modeling process.