Regression – Difference Between Least Squares Line and Regression Line Explained

correlation, distributions, least squares, regression, terminology

It is common to plot the line of best fit on a scatter plot when there is a linear association between two variables. One way to do this is with the line found using the least-squares method. Another would be to use a regression line, which can be written as $\frac{y-\bar{y}}{\mathrm{SD}(y)} = r\,\frac{x-\bar{x}}{\mathrm{SD}(x)}$. What is the difference between these two models? I don't understand when to use one over the other. We also learned that the regression line always passes through the averages of the conditional y distributions if the data is football-shaped when plotted. Is this also the case for the least-squares line?

Best Answer

Linear regression ends up being a lot more than this, but whether you plot a “trend line” in Excel or use either of the methods you’ve mentioned, you get the same line. The formula you give is a simple way of finding the regression equation that works in the particular case you’re considering, where there’s only one predictor variable. We use matrix algebra when the regression is more complicated, with multiple predictors, and what you have is a special case of that full model.
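As a quick numerical check of the claim that the two methods agree, here is a minimal sketch in Python using NumPy. The function names `sd_line` and `least_squares_line` and the simulated data are made up for illustration; they are not from the question. The first function rearranges the formula in the question into slope-intercept form, and the second fits a degree-1 polynomial by least squares.

```python
import numpy as np


def sd_line(x, y):
    """Slope and intercept implied by (y - mean(y))/SD(y) = r*(x - mean(x))/SD(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]            # sample correlation coefficient
    slope = r * y.std() / x.std()          # the SD ratio is the same for ddof=0 or ddof=1
    intercept = y.mean() - slope * x.mean()
    return slope, intercept


def least_squares_line(x, y):
    """Slope and intercept of the ordinary least-squares fit of y on x."""
    slope, intercept = np.polyfit(x, y, 1)  # coefficients come back highest degree first
    return slope, intercept


# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=200)

print(sd_line(x, y))
print(least_squares_line(x, y))
```

The two printed pairs should match up to floating-point rounding, which is the sense in which the methods are "all the same" when there is a single predictor.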

I do not follow what you’ve written about a football, but the least squares regression line passes through $(\bar{x},\bar{y})$, yes.
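To see why, the formula from the question can be rearranged into point-slope form. The symbol $b_1$ for the slope is my own notation, not something from the question:

$$
\frac{y-\bar{y}}{\mathrm{SD}(y)} = r\,\frac{x-\bar{x}}{\mathrm{SD}(x)}
\quad\Longrightarrow\quad
y = \bar{y} + b_1\,(x-\bar{x}),
\qquad b_1 = r\,\frac{\mathrm{SD}(y)}{\mathrm{SD}(x)}.
$$

Setting $x=\bar{x}$ makes the second term vanish and leaves $y=\bar{y}$, so the line passes through $(\bar{x},\bar{y})$. The slope $b_1$ here is also the slope that least squares gives with one predictor, which is why the two lines coincide.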
