Solved – Linear regression on a set of points with two lines

linear model, regression

I have a set of points in 2D space that contains two tight lines (with minimal scatter), each with a different slope and offset. There are also randomly scattered points that do not fall on either line.

It appears that simple linear regression works with only one line. If there is no linear regression technique that handles two lines within the same set of points, another thought I had was to run a Hough transform on the points.

Best Answer

You can do regression on two lines if you know which line should apply to each observation.

$$ y_i = \beta_0 + \beta_1G_i + \beta_2 x_i $$

This fits two parallel lines, where $G_i$ is a 0/1 indicator of group membership and $\beta_1$ is the vertical offset between the lines. If you don't want to force the lines to be parallel, add an interaction term:

$$ y_i = \beta_0 + \beta_1G_i + (\beta_2 + \beta_3 G_i) x_i $$
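In this model, group 0 has intercept $\beta_0$ and slope $\beta_2$, while group 1 has intercept $\beta_0 + \beta_1$ and slope $\beta_2 + \beta_3$. A minimal sketch of the fit with NumPy, assuming the group indicator is known and coded 0/1 (the function name `fit_two_lines` and the simulated data are made up for illustration):

```python
import numpy as np

def fit_two_lines(x, y, g):
    """Least-squares fit of y = b0 + b1*g + (b2 + b3*g)*x with g in {0, 1}."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    g = np.asarray(g, dtype=float)
    # Design matrix columns: intercept, group indicator, x, interaction g*x.
    X = np.column_stack([np.ones_like(x), g, x, g * x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Simulated data (an assumption for the demo): two tight lines, known groups.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
g = rng.integers(0, 2, 200)
y = np.where(g == 0, 1.0 + 2.0 * x, 5.0 - 0.5 * x) + rng.normal(0, 0.05, 200)

beta = fit_two_lines(x, y, g)
print("group 0: intercept", beta[0], "slope", beta[2])
print("group 1: intercept", beta[0] + beta[1], "slope", beta[2] + beta[3])
```

Fitting both groups in one model rather than running two separate regressions also makes it easy to drop the interaction column and recover the parallel-lines model.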

If you don't know group membership, you could try an iterative process that alternates between estimating the lines and estimating the group membership. I have never done that, but it should work if you really have two groups with distinct lines.
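One possible reading of that iterative process is a hard-assignment loop, similar in spirit to k-means: assign each point to the nearer line, refit each line on its points, and repeat until the assignment stops changing. The sketch below is only an illustration under several assumptions that are not in the answer: the residual cutoff `tol` (used to set the scattered points aside), the random two-point starts, the number of restarts, and the simulated data are all hypothetical choices.

```python
import numpy as np

def fit_two_lines_unlabeled(x, y, tol=0.5, n_restarts=300, max_iter=50, seed=0):
    """Alternate between assigning points to the nearer line and refitting.

    tol is a hypothetical cutoff: points whose vertical residual to both
    lines exceeds it are treated as scatter and left out of the refit.
    Returns (lines, labels): lines is a 2x2 array of (slope, intercept);
    labels is 0 or 1 for line membership and -1 for scatter.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    best_lines, best_labels, best_count = None, None, -1

    for _ in range(n_restarts):
        # Start each line from a random pair of points.
        lines = np.empty((2, 2))
        for k in range(2):
            pair = rng.choice(len(x), size=2, replace=False)
            lines[k] = np.polyfit(x[pair], y[pair], 1)

        labels = np.full(len(x), -2)  # sentinel so the first pass never matches
        for _ in range(max_iter):
            # Vertical residual of every point to each line, shape (2, n).
            resid = np.abs(y - (lines[:, [0]] * x + lines[:, [1]]))
            new_labels = np.argmin(resid, axis=0)
            new_labels[resid.min(axis=0) > tol] = -1
            if np.array_equal(new_labels, labels):
                break  # assignment is stable
            labels = new_labels
            for k in range(2):
                members = labels == k
                if members.sum() >= 2:  # otherwise keep the previous line
                    lines[k] = np.polyfit(x[members], y[members], 1)

        # Keep the restart that explains the most points.
        count = int((labels >= 0).sum())
        if count > best_count:
            best_lines, best_labels, best_count = lines.copy(), labels.copy(), count

    return best_lines, best_labels

# Simulated data (an assumption for the demo): two tight lines plus scatter.
rng = np.random.default_rng(1)
x1 = rng.uniform(0, 10, 100)
x2 = rng.uniform(0, 10, 100)
xs = rng.uniform(0, 10, 30)
x = np.concatenate([x1, x2, xs])
y = np.concatenate([
    1.0 + 2.0 * x1 + rng.normal(0, 0.05, 100),
    5.0 - 0.5 * x2 + rng.normal(0, 0.05, 100),
    rng.uniform(-5, 25, 30),
])

lines, labels = fit_two_lines_unlabeled(x, y)
print(lines)  # one (slope, intercept) row per line, in no particular order
```

The loop can settle into a poor local solution from a bad start, which is why the sketch restarts many times and keeps the result with the most assigned points. The cutoff `tol` has to be chosen relative to the scatter around the lines, so treat the default as a placeholder.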