Solved – How to explain the “line of best fit” in this diagram

regression, teaching

I teach an intro statistics class at my university (as a graduate student) and I was scouring the internet for interesting graphs on the history of linear regression when I came upon this picture, presumably from a paper that Pearson once wrote:

[Image: Pearson's graph]

I was brainstorming ways to explain to my class why linear regression is called "regression" (for "regression to the mean", as I understand it) and this chart caught my eye, but upon further review I became confused as to what it means.

The website where I got this picture claims that the significance of this chart is that it shows that what makes a line a "regression" line is that its slope is less than 1. OK, I understand that rationale well enough.

But… then what explains the "line of best fit" drawn on the graph? To my eye, the regression line looks like the classic least-squares line, and the "line of best fit" looks like the line that minimizes the ordinary Euclidean (i.e. perpendicular) distances from the points to the line.

If that's the difference between the two lines, then I get it. But then is there anything more to read into the difference between the two lines with regard to "regression to the mean"? For example, is there a good intuitive way to explain why minimizing the squared vertical residuals (instead of the squared perpendicular distances) gives a line with a shallower slope in this case?

Best Answer

That image was taken from Karl Pearson's 1901 article "On Lines and Planes of Closest Fit to Systems of Points in Space" (p. 566), which is the paper that introduced the concept of principal component analysis.

The introduction of the paper implies that regression is applicable to the case where the independent variables are exactly known. In this paper, Pearson is exploring the case of finding a line of best fit when the independent variables are corrupted by some error. He goes on to define the best-fit line as follows:

"The best-fitting straight line for the system of points coincides in direction with the major axis of the correlation ellipse... . Physically the axes of the correlation-type ellipse are the directions of independent or uncorrelated variation. Hence the line of best fit is a direction of uncorrelated variation."

So the line of best fit in the figure lies along the major axis of the correlation ellipse, the direction in which the points vary the most, and this is not in general the same as the regression line. You are correct: it is the line which minimizes the sum of the squared perpendicular distances between the points and the line.
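To make the distinction concrete, here is a minimal sketch in Python with NumPy that computes both slopes for the same cloud of points. The data are synthetic and made up purely for illustration (they are not Pearson's), and the helper names `ols_slope` and `perpendicular_fit_slope` are hypothetical. The perpendicular-distance line is taken along the leading eigenvector of the sample covariance matrix, which is the major axis of the correlation ellipse described in the quote.

```python
import numpy as np


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (vertical residuals)."""
    cov = np.cov(x, y)
    return cov[0, 1] / cov[0, 0]


def perpendicular_fit_slope(x, y):
    """Slope of the line minimizing squared perpendicular distances.

    This line points along the eigenvector of the covariance matrix
    with the largest eigenvalue (the major axis of the ellipse).
    """
    cov = np.cov(x, y)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    major_axis = eigvecs[:, -1]             # eigenvector for the largest one
    return major_axis[1] / major_axis[0]    # rise over run; sign cancels


# Synthetic, illustrative data only: a noisy linear relationship.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.8 * x + rng.normal(scale=0.6, size=500)

b_ols = ols_slope(x, y)
b_perp = perpendicular_fit_slope(x, y)

print(f"least-squares slope:      {b_ols:.3f}")
print(f"perpendicular-fit slope:  {b_perp:.3f}")
```

Both lines pass through the point of means, so only the slopes differ. On this sample the perpendicular-fit slope comes out steeper than the least-squares slope, which matches the arrangement of the two lines in the figure.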

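I think the figure is mainly intended to visualize this point. The ordering of the two slopes is not an accident of this dataset, though: the least-squares slope of y on x is never steeper in magnitude than the slope of the perpendicular-distance line, because the regression line attributes all of the scatter to y while the best-fit line splits it between both variables. Section 7 of Pearson's paper gives some numerical examples which may be useful to you.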