Regression – Detecting Patterns in Residual Plots for Regression Analysis


I wish to automatically (not by visual inspection) detect where large deviations occur in a residual plot from a regression. For example, suppose I have the residual plot below:

[Figure: residual plot]

I want to automatically detect that the observations around 30:35 deviate from a normal residual pattern. Some clues are that the magnitudes are quite large and that the residuals do not appear to be independent in this region. How can I go about this?

Best Answer

A dependent mixture model (hidden Markov model) may be of use, depending on the type of deviations expected.

Assume that your observations come from two distributions (or states), both normal, each with its own mean and variance.

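The model then has ten parameters to estimate: the initial state probabilities (2 parameters), the state transition probabilities between neighbouring data points (4 parameters), and the mean and variance of each of the two distributions (4 parameters). Because the initial probabilities sum to one, as does each row of the transition matrix, only seven of these are free.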
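In symbols, the two-state model described above could be written roughly as follows. The notation ($S_t$ for the hidden state at observation $t$, $\pi$, $a_{ij}$, $\mu_k$, $\sigma_k$) is introduced here purely for illustration and is not taken from the package documentation.

```latex
\begin{align}
  S_t &\in \{1, 2\}, \qquad t = 1, \dots, T \\
  P(S_1 = k) &= \pi_k \\
  P(S_t = j \mid S_{t-1} = i) &= a_{ij} \\
  y_t \mid S_t = k &\sim \mathcal{N}(\mu_k, \sigma_k^2)
\end{align}
```

Under this reading, a run of unusual residuals shows up as a stretch of consecutive observations assigned to the state with the larger mean or variance.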

In R, this model can be estimated using the depmixS4 package:

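library(depmixS4)

# Simulate 100 standard-normal "residuals" with a shifted, noisier stretch at 30:35
set.seed(3)
y <- rnorm(100)
y[30:35] <- rnorm(6, mean = 4, sd = 2)
plot(1:100, y, type = "l")

# Two-state Gaussian hidden Markov model with an intercept-only response
m <- depmix(y ~ 1, nstates = 2, ntimes = 100)
fm <- fit(m)

# getpars() order: 2 initial probabilities, 4 transition probabilities,
# then (mean, sd) for each state, so elements 7 and 9 are the state means
means <- getpars(fm)[c(7, 9)]
# Overlay the mean of the most likely state at each observation
lines(1:100, means[fm@posterior$state], lwd = 2, col = 2)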

[Figure: the simulated series with the fitted state means overlaid]
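To turn the fitted states into an automatic list of suspect observations, one option is to flag every index whose most likely state differs from the dominant one. The sketch below repeats the simulation so that it runs on its own. The names `states`, `normal_state` and `flagged` are hypothetical, and treating the most frequently visited state as the "normal" regime is an assumption that may not suit every data set.

```r
library(depmixS4)

# Same simulated data as in the answer
set.seed(3)
y <- rnorm(100)
y[30:35] <- rnorm(6, mean = 4, sd = 2)

# Fit the two-state Gaussian hidden Markov model
m  <- depmix(y ~ 1, nstates = 2, ntimes = 100)
fm <- fit(m)

# Most likely state for each observation
states <- fm@posterior$state

# Assumption: the "normal" regime is the state visited most often
normal_state <- as.integer(names(which.max(table(states))))

# Indices assigned to the other state are the candidate deviations
flagged <- which(states != normal_state)
print(flagged)
```

With real residuals, replace `y` with the residual vector and set `ntimes` to its length. The fit depends on random starting values, so it may be worth refitting a few times and comparing the flagged indices.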

See the depmixS4 vignette at http://cran.r-project.org/web/packages/depmixS4/vignettes/depmixS4.pdf for details and references.
