Solved – How to interpret this Residuals vs Fitted plot for multiple regression

multiple regression, r, residuals

I'm quite new at linear regression.
I get the following Residuals vs Fitted plot with parallel straight lines:

[Image: Residuals vs Fitted plot showing two parallel straight lines]

I do not understand why there are two parallel lines. Is there a problem with my data?

Thanks!

Best Answer

  1. Your original data consist of a pair of parallel lines!

    Something like this:

    [Image: scatter plot of data lying on a pair of parallel lines, with the least squares linear fit in red]

    The red line indicates the least squares linear fit for this one-predictor case.

  2. You then subtract the linear fit in red from the data lying on that pair of parallel lines to get a downward-sloping pair of lines in the residuals (calculating residuals from the fitted values is a skew transformation of the plot vs x, and plotting them vs fitted instead simply rescales the x-axis):

    [Image: residual plot showing a pair of parallel downward-sloping lines]
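
The two steps above can be checked numerically. The following is only a minimal sketch in plain Python, assuming the response takes just the values 0 and 1 (which is what a pair of parallel lines in the raw data suggests); the toy `x` and `y` values are made up for illustration and are not the data from the question. Because residual = y - fitted, every point with y = 0 falls on the line residual = -fitted and every point with y = 1 falls on residual = 1 - fitted, both with slope -1.

```python
# Hypothetical toy data: one predictor x and a 0/1 response y.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
y = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Ordinary least squares for a single predictor.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# residual + fitted recovers y, so in the (fitted, residual) plane each
# point lies on one of two parallel lines with slope -1.
for fi, ri, yi in zip(fitted, residuals, y):
    line = "residual = 1 - fitted" if yi == 1 else "residual = -fitted"
    print(f"fitted={fi:6.3f}  residual={ri:7.3f}  ({line})")
```

Plotting `residuals` against `fitted` for data like this should reproduce the two downward-sloping lines.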

If you have multiple predictors, though, the plot would not look "neat" like this (with two clean lines). Are you certain you fitted a multiple regression in your display?


A linear fit is generally not suitable for such data, since the fitted line goes outside the 0-1 range (with my data, see where the line crosses above the data at about x=4). More commonly, a model that predicts the probability that the response is 1, such as logistic regression, would be used.
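
To make the contrast concrete, here is a rough sketch in plain Python that fits both a least squares line and a logistic regression to the same made-up 0/1 data. Everything in it is an assumption for illustration: the data are hypothetical, and the logistic fit uses a hand-written Newton's method for one predictor, not the method used by any particular package (in R one would normally call `glm` with `family = binomial` instead).

```python
import math

# Hypothetical 0/1 data with some overlap between the two groups.
x = [0.5 * i for i in range(13)]  # 0.0, 0.5, ..., 6.0
y = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1]
n = len(x)

# Least squares line for a single predictor.
x_bar, y_bar = sum(x) / n, sum(y) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
linear_fit = [intercept + slope * xi for xi in x]


def sigmoid(z):
    """Map a linear predictor to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Logistic regression by Newton's method (two parameters: b0, b1).
b0, b1 = 0.0, 0.0
for _ in range(25):
    p = [sigmoid(b0 + b1 * xi) for xi in x]
    w = [pi * (1.0 - pi) for pi in p]
    # Gradient of the log-likelihood.
    g0 = sum(yi - pi for yi, pi in zip(y, p))
    g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
    # Entries of the 2x2 information matrix X'WX.
    h00 = sum(w)
    h01 = sum(wi * xi for wi, xi in zip(w, x))
    h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = h00 * h11 - h01 * h01
    # Newton step: solve the 2x2 system explicitly.
    b0 += (h11 * g0 - h01 * g1) / det
    b1 += (h00 * g1 - h01 * g0) / det

logistic_fit = [sigmoid(b0 + b1 * xi) for xi in x]

print("linear fit range:  ", min(linear_fit), max(linear_fit))
print("logistic fit range:", min(logistic_fit), max(logistic_fit))
```

On this toy data the straight line dips below 0 at the low end and rises above 1 at the high end, while the logistic probabilities stay inside (0, 1) by construction.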

You may find the discussion at this post, which compares the model you fitted with the one I just mentioned, of some additional value.