Identifying outliers for nonlinear regression

Tags: nonlinear regression, outliers, r

I am doing research in the field of functional response of mites.
I would like to do a regression to estimate the parameters (attack rate and handling time) of the Rogers type II function.
I have a dataset of measurements.
How can I best determine outliers?

For my regression I use the following R script (a nonlinear regression).
The dataset is a simple two-column text file called data.txt, with N0 values (initial number of prey) and FR values (number of prey eaten in 24 hours):

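library("nlstools")
library("emdbook")   # lambertW() is not part of nlstools; assumed here to come from emdbook
dat <- read.delim("C:/data.txt")
# Rogers type II model
a <- c(0,50)   # x-axis limits
b <- c(0,40)   # y-axis limits
plot(FR~N0,data=dat,main="Rogers II normaal",xlim=a,ylim=b,xlab="N0",ylab="FR")
# Rogers random predator equation, solved with the Lambert W function
rogers.predII <- function(N0,a,h,T) {N0 - lambertW(a*h*N0*exp(-a*(T-h*N0)))/(a*h)}
params1 <- list(attackR3_N=0.04,Th3_N=1.46)
RogersII_N <- nls(FR~rogers.predII(N0,attackR3_N,Th3_N,T=24),start=params1,data=dat,control=list(maxiter=10000))
hatRIIN <- predict(RogersII_N)
# N0 lives inside dat, so refer to it explicitly when drawing the fitted curve
lines(spline(dat$N0,hatRIIN))
summary(RogersII_N)$parameters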

For plotting the classic residual graphs I use the following script:

res <- nlsResiduals(RogersII_N)
plot(res, type = 0)
# resi1 and resi2 are two-column matrices (fitted values, residuals);
# column 2 holds the raw and the standardised residuals respectively
hist(res$resi1[,2],main="histogram residuals")
qqnorm(res$resi1[,2],main="QQ residuals")
hist(res$resi2[,2],main="histogram normalised residuals")
qqnorm(res$resi2[,2],main="QQ normalised residuals")
par(mfrow=c(1,1))
boxplot(res$resi1[,2],main="boxplot residuals")
boxplot(res$resi2[,2],main="boxplot normalised residuals")

Questions

  • How can I best determine which data points are outliers?
  • Are there tests I can use in R which are objective and show me which data points are outliers?

Best Answer

Several tests for outliers, including Dixon's and Grubbs', are available in the outliers package in R. For a list of the tests, see the package documentation. References describing each test are given on the help page of the corresponding function.
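
As a rough illustration of how such a test might be applied in a regression setting, the sketch below computes Grubbs' statistic on the residuals of a fitted model using base R only. Everything in it is an assumption for demonstration: the data are simulated, the Holling type II disc equation (`holling2`) stands in for the Rogers model so that no extra package is needed, and the names `sim`, `Ttot`, `G` and `G_crit` are made up here. The outliers package offers `grubbs.test()` as a packaged version of the test. Note that Grubbs' test assumes roughly normal residuals and examines one suspect point at a time.

```r
# Simulated functional-response data (hypothetical, not the asker's data)
set.seed(1)
N0 <- rep(c(2, 5, 10, 20, 30, 40, 50), each = 4)

# Holling type II disc equation, used only as a simple stand-in model
holling2 <- function(N0, a, h, Ttot) a * N0 * Ttot / (1 + a * h * N0)

FR <- holling2(N0, a = 0.04, h = 1.46, Ttot = 24) + rnorm(length(N0), sd = 0.5)
FR[10] <- FR[10] + 6          # plant one artificial outlier
sim <- data.frame(N0 = N0, FR = FR)

# Fit the model and extract the residuals
fit <- nls(FR ~ holling2(N0, a, h, Ttot = 24), data = sim,
           start = list(a = 0.05, h = 1))
r <- as.numeric(residuals(fit))

# Grubbs' statistic: largest absolute deviation from the mean, in SD units
n <- length(r)
G <- max(abs(r - mean(r))) / sd(r)

# Two-sided critical value at significance level alpha
alpha  <- 0.05
t_crit <- qt(1 - alpha / (2 * n), df = n - 2)
G_crit <- (n - 1) / sqrt(n) * sqrt(t_crit^2 / (n - 2 + t_crit^2))

# Index of the most extreme residual, and whether it exceeds the threshold
suspect <- which.max(abs(r - mean(r)))
cat("suspect row:", suspect, " G =", round(G, 2),
    " critical value =", round(G_crit, 2), "\n")
```

If `G` exceeds `G_crit`, the row in `suspect` is flagged as a candidate outlier; whether it should actually be dropped is a separate question.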

In case you were planning to remove the outliers from your data, bear in mind that this isn't always advisable. See for instance this question for a discussion on this (as well as some more suggestions on how to detect outliers).
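
One way to judge whether a suspect point matters, without committing to removing it, is to refit the model with and without that point and compare the parameter estimates. The following is a minimal sketch under the same kind of assumptions as any toy example: simulated data, a Holling type II stand-in model (`holling2`), and hypothetical helper names (`fit_model`, `comparison`).

```r
# Simulated data with one artificial outlier at high prey density (hypothetical)
set.seed(2)
N0 <- rep(c(2, 5, 10, 20, 30, 40, 50), each = 4)
holling2 <- function(N0, a, h, Ttot) a * N0 * Ttot / (1 + a * h * N0)
FR <- holling2(N0, a = 0.04, h = 1.46, Ttot = 24) + rnorm(length(N0), sd = 0.5)
FR[26] <- FR[26] + 6
sim <- data.frame(N0 = N0, FR = FR)

# Helper that fits the same model to any subset of the data
fit_model <- function(d) {
  nls(FR ~ holling2(N0, a, h, Ttot = 24), data = d,
      start = list(a = 0.05, h = 1))
}

fit_all <- fit_model(sim)

# Row with the largest absolute residual in the full fit
suspect <- which.max(abs(as.numeric(residuals(fit_all))))

# Refit without that row and put the two sets of estimates side by side
fit_drop   <- fit_model(sim[-suspect, ])
comparison <- rbind(all_points      = coef(fit_all),
                    without_suspect = coef(fit_drop))
print(comparison)
```

If the attack rate and handling time barely move, the point has little influence and can stay; if they shift a lot, that is worth reporting rather than silently deleting the observation.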
