Solved – Residuals analysis: interpretation of a scatter plot

least squares, multiple regression, r, regression, residuals

I am having trouble interpreting a scatter plot from a multiple linear regression (OLS method). Below is a plot of the standardized residuals against the predicted values of my dependent variable (C).

My question is: based on this graph, can I assume that my linear regression model is adequate? The pattern in the residuals looks suspicious, and such a pattern is often a sign of a nonlinear relationship between the variables. What do you think of my case? Thanks to everyone.

[Image: standardized residuals vs predicted values of C]

EDIT: here is the dataset!ehtEERJA!_3OMnu2GutFmM9R9fZjfQIthF7bzCNMaT_g1Q2033ko

C is the dependent variable; the other variables are the independent variables.
Durbin-Watson statistic: 1.241603582
The Shapiro-Wilk test does not reject normality of the residuals.

EDIT 2: here is the Q-Q plot of the residuals:
[Image: Q-Q plot of the residuals]

Best Answer

No, this does not look good. You appear to have a problem with heteroscedasticity as there is increasing variance of residuals with increasing predicted values. Constant variance is an important condition for OLS regression in order to perform valid inference. This might be resolved by log-transforming the response variable.
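As a rough numerical companion to eyeballing the plot, here is a minimal sketch that compares the mean squared residual in the upper half of the fitted values with that in the lower half. The function name `spread_ratio` and the numbers are made up for illustration (they are not the data from the question), Python is used only as a convenient notation, and the idea is only loosely in the spirit of a Goldfeld-Quandt comparison, not a formal test.

```python
def spread_ratio(fitted, residuals):
    """Mean squared residual in the upper half of the fitted values
    divided by the mean squared residual in the lower half.
    Hypothetical helper for illustration, not a formal test."""
    # Order the residuals by their fitted value.
    pairs = sorted(zip(fitted, residuals))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]    # residuals at small fitted values
    high = [r for _, r in pairs[-half:]]  # residuals at large fitted values

    def mean_square(rs):
        return sum(r * r for r in rs) / len(rs)

    return mean_square(high) / mean_square(low)


# Made-up numbers: the residual spread triples for the larger fitted values.
fitted = [10, 20, 30, 40, 50, 60, 70, 80]
residuals = [-1, 1, -1, 1, -3, 3, -3, 3]
print(spread_ratio(fitted, residuals))  # 9.0
```

A ratio well above 1 is consistent with the funnel shape described above. For actual inference a formal test such as Breusch-Pagan would be more appropriate, and with very few observations any such check is fragile.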

There is also a hint of autocorrelation but this is hard to assess with so few data points.
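For reference, the Durbin-Watson statistic quoted in the question is the sum of squared successive differences of the residuals divided by their sum of squares. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. A minimal sketch, with made-up residuals rather than the ones from the question:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e * e for e in residuals)
    return num / den


# Made-up residual sequences, in observation order.
same_sign_runs = [1, 1, 1, -1, -1, -1]  # neighbours tend to agree
alternating = [1, -1, 1, -1, 1, -1]     # neighbours tend to disagree

print(round(durbin_watson(same_sign_runs), 3))  # 0.667
print(round(durbin_watson(alternating), 3))     # 3.333
```

The value of about 1.24 reported in the question lies below 2, which points in the direction of positive autocorrelation, but with so few observations the critical bounds are wide and the evidence is weak.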

Edit, after downloading the data:

Log-transforming C helps with heteroscedasticity, though there are few data points so I would advise some caution: while it seems to help with these data, it may not be the case with more observations. There could be other non-linearities that should be accounted for.

[Image: residual plot for the model with log-transformed C]
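One way to see why a log transform can help is to assume that the errors act multiplicatively on C. This is an assumption for illustration, not something the data establish, and the symbols below (the coefficients, the predictors x, the error term and its variance) are introduced only for this sketch:

```latex
% Assumed multiplicative-error model (illustrative, not fitted to the data)
C_i = \mu_i \, e^{\varepsilon_i}, \qquad
\mu_i = \exp\!\Big(\beta_0 + \sum_j \beta_j x_{ij}\Big), \qquad
\mathbb{E}[\varepsilon_i] = 0, \quad \operatorname{Var}(\varepsilon_i) = \sigma^2

% Original scale: the standard deviation grows in proportion to \mu_i
\operatorname{Var}(C_i \mid x_i) = \mu_i^2 \, \operatorname{Var}\!\big(e^{\varepsilon_i}\big)

% Log scale: the error is additive with constant variance
\log C_i = \beta_0 + \sum_j \beta_j x_{ij} + \varepsilon_i, \qquad
\operatorname{Var}(\log C_i \mid x_i) = \sigma^2
```

Under that assumption the residual spread on the original scale fans out with the predicted value, while on the log scale it does not. If the true error structure is different, the transform may help less or not at all.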

However, all your independent variables are highly correlated with each other (multicollinearity), which makes the individual coefficient estimates unstable and hard to interpret:

      years    Y    W  SSW    G    T   TR    D
years  1.00 0.95 0.96 0.96 0.98 0.98 1.00 0.98
Y      0.95 1.00 0.99 0.95 0.97 0.98 0.95 0.87
W      0.96 0.99 1.00 0.97 0.98 0.98 0.96 0.89
SSW    0.96 0.95 0.97 1.00 0.98 0.97 0.97 0.93
G      0.98 0.97 0.98 0.98 1.00 0.99 0.99 0.95
T      0.98 0.98 0.98 0.97 0.99 1.00 0.98 0.93
TR     1.00 0.95 0.96 0.97 0.99 0.98 1.00 0.98
D      0.98 0.87 0.89 0.93 0.95 0.93 0.98 1.00
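To put these correlations in perspective: the variance inflation factor (VIF) of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. Since that R² is at least as large as the squared correlation with any single other predictor, a pairwise correlation r gives a lower bound of 1 / (1 - r²) on the VIF. A minimal sketch using a few values copied from the matrix above; the function name `pairwise_vif_floor` is hypothetical, and the correlations are rounded to two decimals, so the results are approximate:

```python
import math


def pairwise_vif_floor(r):
    """Lower bound on a predictor's VIF implied by one pairwise correlation r."""
    r2 = r * r
    if r2 >= 1.0:
        # A correlation that rounds to 1.00 gives an unbounded value.
        return math.inf
    return 1.0 / (1.0 - r2)


# Correlations copied from the matrix above (rounded to 2 decimals).
pairs = {("Y", "W"): 0.99, ("Y", "D"): 0.87, ("years", "TR"): 1.00}

for (a, b), r in pairs.items():
    print(f"{a} ~ {b}: r = {r:.2f}, VIF >= {pairwise_vif_floor(r):.1f}")
```

For Y and W (r = 0.99) the bound is about 50, and even the weakest pair, Y and D (r = 0.87), gives about 4. A common rule of thumb treats VIFs above roughly 10 as a warning sign, so the coefficient standard errors in the full model are likely to be heavily inflated.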