R-Squared and Regression – Interpreting Simple Linear Regression Output

r-squared, regression

I have run a simple linear regression of the natural log of 2 variables to determine if they correlate. My output is this:

R^2 = 0.0893

slope = 0.851

p < 0.001

I am confused. Looking at the $R^2$ value, I would say that the two variables are not correlated, since it is so close to $0$. However, the slope of the regression line is almost $1$ (despite looking as though it's almost horizontal in the plot), and the p-value indicates that the regression is highly significant.

Does this mean that the two variables are highly correlated? If so, what does the $R^2$ value indicate?

I should add that the Durbin-Watson statistic was tested in my software, and did not reject the null hypothesis (it equalled $1.357$). I thought that this tested for independence between the $2$ variables. In this case, I would expect the variables to be dependent, since they are $2$ measurements of an individual bird. I'm doing this regression as part of a published method to determine body condition of an individual, so I assumed that using a regression in this way made sense. However, given these outputs, I'm thinking that maybe for these birds, this method isn't suitable. Does this seem a reasonable conclusion?

Best Answer

The estimated slope does not, by itself, tell you the strength of the relationship. That strength depends on the size of the error variance and the range of the predictor. Likewise, a significant $p$-value does not necessarily indicate a strong relationship; the $p$-value only tests the null hypothesis that the slope is exactly $0$. With a sufficiently large sample, even small departures from that hypothesis (e.g. ones of no practical importance) will yield a significant $p$-value.
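To make this concrete, here is a minimal simulation sketch in plain Python. Everything in it is an assumption chosen for illustration, not your data: the sample size (`n=2000`), the true slope of `0.85`, the predictor's standard deviation of `1`, and the two noise levels. The helper names `simple_ols` and `simulate` are hypothetical, and the $p$-value uses a normal approximation to the $t$ distribution, which is only reasonable for large samples.

```python
import math
import random


def simple_ols(x, y):
    """Return (slope, R^2, approximate two-sided p-value for H0: slope = 0)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    r2 = sxy ** 2 / (sxx * syy)
    # t statistic for the slope, written in terms of R^2 (n - 2 degrees of freedom)
    t = math.sqrt(r2 * (n - 2) / (1.0 - r2))
    # Normal approximation to the t distribution; an assumption that suits large n
    p = math.erfc(t / math.sqrt(2.0))
    return slope, r2, p


def simulate(noise_sd, n=2000, true_slope=0.85, seed=0):
    """Simulate y = true_slope * x + noise and fit a simple regression."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [true_slope * xi + rng.gauss(0.0, noise_sd) for xi in x]
    return simple_ols(x, y)


# Same true slope in both runs; only the error variance differs (assumed values).
noisy = simulate(noise_sd=2.7)
tight = simulate(noise_sd=0.3)

for label, (slope, r2, p) in (("noisy", noisy), ("tight", tight)):
    print(f"{label}: slope={slope:.3f}  R^2={r2:.3f}  p={p:.2e}")
```

Both runs should recover a slope near `0.85` with a very small $p$-value, but only the low-noise run should produce a large $R^2$. That is the sense in which the slope and the $p$-value say little about how tightly the points cluster around the line.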

Of the three quantities you presented, $R^2$, the coefficient of determination, gives the best indication of the strength of the relationship. In your case, $R^{2} = .089$ means that $8.9\%$ of the variation in your response variable can be explained by a linear relationship with the predictor. What constitutes a "large" $R^2$ is discipline-dependent. For example, in the social sciences $R^2 = .2$ might be "large", but in controlled environments like a factory setting, $R^2 > .9$ may be required to say there is a "strong" relationship. In most situations $.089$ is a very small $R^2$, so your conclusion that there is a weak linear relationship is probably reasonable.
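As a rough worked check using only the numbers you reported: in a simple linear regression with an intercept, $R^2$ equals the square of the sample correlation $r$, and the fitted slope $b$ equals $r$ times the ratio of the sample standard deviations. The symbols $r$, $b$, $s_x$ and $s_y$ are notation introduced here, and they refer to the log-transformed variables.

$$
|r| = \sqrt{R^2} = \sqrt{0.0893} \approx 0.30,
\qquad
b = r\,\frac{s_y}{s_x}
\;\Longrightarrow\;
\frac{s_y}{s_x} = \frac{b}{r} \approx \frac{0.851}{0.299} \approx 2.85
$$

So the correlation is about $0.30$ (positive, since the slope is positive), and the slope comes out near $1$ only because the response varies roughly $2.85$ times as much as the predictor.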