[Math] Strong vs weak relationship in this correlation

Tags: correlation, regression, regression analysis, statistics

[Image: scatter plot of the data with the fitted regression line]

I produced this plot and regression line in R, and the results look odd to me. Is the strength of the correlation determined by how steep the regression line is? In this case the line isn't very steep, so is it fair to assume the relationship is weak? I also wondered about the regression line itself: a lot of the data lies well below it, so could the line be incorrect?

Best Answer

Yes, something is off here. A least-squares regression line always passes through the middle of your data (the average of your x and the average of your y form a point on the line), so your line seems too high compared with the rest of your data.
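
As a quick sanity check of that property, here is a minimal sketch in plain Python. The data values and the helper names (`mean`, `least_squares`) are made up for illustration and are not taken from the question. It fits an ordinary least-squares line with an intercept and confirms that the line passes through the point of averages and that the residuals sum to zero, which is why a correct fit cannot sit above nearly all of the points unless a few points lie far above it.

```python
# Hypothetical data, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = least_squares(xs, ys)

# The fitted value at the mean of x equals the mean of y.
predicted_at_mean = intercept + slope * mean(xs)

# Residuals (observed minus fitted) cancel out overall.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print("fitted value at mean x:", predicted_at_mean)
print("mean of y:            ", mean(ys))
print("sum of residuals:     ", sum(residuals))
```

If the same check on your own data shows the fitted value at the mean of x far from the mean of y, the line on the plot was probably not fitted to the points shown.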

For fixed standard deviations, the correlation is proportional to the slope of your regression line (m = r*sy/sx, where sy and sx are the standard deviations of y and x, respectively), but you can't tell the correlation just by looking at how steep the line is, because the slope also depends on the scales of x and y. Consider the data (1, 0.01), (2, 0.02), (3, 0.03), ... The best-fit line is y = 0.01x, which is nearly horizontal (the slope is 0.01), yet the correlation is perfect (r = 1).
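
To see why steepness alone says little about the correlation, here is a small sketch in plain Python. The data and the function names (`sample_sd`, `pearson_r`, `ols_slope`) are illustrative assumptions, not anything from the question. It expresses the same y-values in two different units: the slope changes by the rescaling factor while r stays the same, and in both cases the slope matches r*sy/sx.

```python
import math


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
ys_rescaled = [y / 100 for y in ys]  # same measurements in other units

for label, y in (("original", ys), ("rescaled", ys_rescaled)):
    r = pearson_r(xs, y)
    m = ols_slope(xs, y)
    # The last column recomputes the slope from m = r * sy / sx.
    print(label, "slope:", m, "r:", r, "r*sy/sx:", r * sample_sd(y) / sample_sd(xs))
```

The rescaled line is 100 times flatter, but the relationship is exactly as strong, so r (or r squared) is the number to read for strength, not the slope.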

I would run your regression again; make sure you are including all points. If you have a number of repeated points above the line it's possible that this is correct, but I doubt that's the case.
