Solved – Confidence interval of first derivative of a loess smooth

confidence interval, loess, r

I have plotted the evolution of some parameters over time. However, I would prefer to plot the change in these parameters over time. I have seen how to predict a confidence interval for a loess smooth, and how to get the first derivative of a smooth, but I do not know how to get the confidence interval of the first derivative of a loess smooth.

Q1) How can I plot the first derivative of a loess smooth with its confidence interval?

Q2) I would like to smooth mean differences between two categories, so each value that goes into the loess smooth is itself computed from several observations. Can I weight data points when using a loess smoothing function?

Q3) Is it acceptable to plot a loess function and its CI in a scientific paper?

Thanks!

Best Answer

There are two ways of doing what you describe (Q1), but unfortunately both are somewhat involved.

The first way (and probably the easier one) is to numerically differentiate the relevant LOESS curve (using, for example, numDeriv::grad). This might be reasonable if your LOESS curve is not very variable. If you have significant jumps or dips then, as numerical differentiation is essentially a finite difference approximation, you might get really nonsensical values. You will have to do this for each of the three curves (lower CI, upper CI and the curve itself). Strictly speaking, differentiating the band limits gives only a rough indication of the uncertainty, not a formal confidence interval for the derivative.
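A minimal sketch of the numerical-differentiation route, using base R central differences rather than numDeriv so that it runs without extra packages. The simulated noisy sine, the `span`, and the step size `h` are assumptions made purely for illustration; they are not taken from the question.

```r
# Simulated data (hypothetical): a noisy sine, so the true derivative is cos(x)
set.seed(1)
x <- seq(0, 2 * pi, length.out = 200)
y <- sin(x) + rnorm(length(x), sd = 0.1)

fit <- loess(y ~ x, span = 0.4)

# The fitted LOESS curve as a function of x
f_hat <- function(x0) predict(fit, newdata = data.frame(x = x0))

# Central finite differences on an interior grid; stay strictly inside the
# data range because predict() returns NA outside it by default
h     <- 0.01
grid  <- seq(min(x) + 2 * h, max(x) - 2 * h, length.out = 100)
d_hat <- (f_hat(grid + h) - f_hat(grid - h)) / (2 * h)

plot(grid, d_hat, type = "l", xlab = "x", ylab = "estimated first derivative")
lines(grid, cos(grid), lty = 2)  # true derivative, known only because the data are simulated
```

The same `f_hat`-style wrapper could be handed to numDeriv::grad instead of the hand-written difference quotient; either way the result is only as smooth as the fitted curve itself.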

The second way (probably more mathematically coherent, but quite involved) is to use the higher-order terms of your local weighted fit. This follows from the definition of LOESS. Remember that LOESS essentially does the following: it takes the data within a window/neighbourhood $S^*$, weights them according to a vector $w$ (usually from a tri-cube kernel in the case of standard LOESS), and then fits a weighted linear regression, i.e. $\hat{f}(s) = \beta_0 + w \beta_1 s $ for the points in your neighbourhood $S^* \subset S$, where $S$ is the whole support over which the data are recorded. With $s$ measured relative to the point being estimated, the final LOESS estimate at that point is $\beta_0$, and the upper and lower CIs are effectively the CIs for $\beta_0$. Now, if you want the derivative, it is natural to simply use $\beta_1$: the derivative is just the slope/gradient of the function. Each localised linear regression again gives you the relevant CIs, this time for $\beta_1$, and you are good to go.
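The local-regression idea can be sketched in base R as follows. This is a hand-rolled illustration, not what loess() does internally: `local_slope` is a hypothetical helper, the data are simulated, and the intervals come straight from confint() on a weighted lm(), which treats the kernel weights as known and ignores smoothing bias, so they should be read as approximate pointwise intervals.

```r
# Local linear fit at one target point x0, using a nearest-neighbour window
# and tricube weights. Returns the fitted value, the slope, and their CIs.
local_slope <- function(x, y, x0, span = 0.5, level = 0.95) {
  k  <- ceiling(span * length(x))           # number of neighbours in the window
  d  <- abs(x - x0)
  bw <- sort(d)[k]                          # distance to the k-th nearest neighbour
  w  <- ifelse(d < bw, (1 - (d / bw)^3)^3, 0)
  keep <- w > 0
  dat  <- data.frame(y = y[keep], u = x[keep] - x0, w = w[keep])
  m  <- lm(y ~ u, data = dat, weights = w)  # u is centred at x0
  ci <- confint(m, level = level)
  c(x = x0,
    fit   = unname(coef(m)[1]), fit_lwr   = ci[1, 1], fit_upr   = ci[1, 2],
    deriv = unname(coef(m)[2]), deriv_lwr = ci[2, 1], deriv_upr = ci[2, 2])
}

# Simulated data (hypothetical): noisy sine, true derivative cos(x)
set.seed(1)
x <- seq(0, 2 * pi, length.out = 200)
y <- sin(x) + rnorm(length(x), sd = 0.1)

grid <- seq(0.5, 2 * pi - 0.5, length.out = 50)
res  <- as.data.frame(t(sapply(grid, function(x0) local_slope(x, y, x0, span = 0.3))))

plot(res$x, res$deriv, type = "l", ylim = range(res$deriv_lwr, res$deriv_upr),
     xlab = "x", ylab = "local slope (first derivative)")
lines(res$x, res$deriv_lwr, lty = 2)
lines(res$x, res$deriv_upr, lty = 2)
```

Because the predictor is centred at `x0`, the intercept is the smoothed value at `x0` and the coefficient on `u` is the derivative estimate there.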

Notes:

I do not know of a LOESS routine in R that does this natively. I co-authored a helper function (fdapace::Lwls1D) that uses local linear kernel smoothing for longitudinal data and lets you obtain the derivatives directly, but it does kernel smoothing and not LOESS. You might want to check its code to get a better idea of what you need to do. I remember that locfit::locfit.raw accepts some quite specialised arguments too, so you might want to check it as well. From what I recall it was a very good piece of software, so it is probably educational to give it a closer look.
For mathematical coherence, one should actually fit the quadratic model $\hat{f}(s) = \beta_0 + w \beta_1 s + w \beta_2 s^2$ and then use $\beta_1$ for the first derivative. Depending on the size of your available dataset this might give better or worse estimates. Within the context of kernel smoothing with Gaussian kernels experimentally I found the linear fit to be less variable and only marginally more biased than the quadratic fit. I do not know if the same insights apply for LOESS, so your mileage may vary on this.
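To get a feel for the linear-versus-quadratic trade-off on your own data, a comparison at a single point can be sketched as below. The simulated data, the target point `x0`, and the 30% window are illustrative assumptions, and the intervals are the plain weighted-lm ones, so treat the output as indicative only.

```r
# Simulated data (hypothetical): noisy sine, true derivative cos(x)
set.seed(1)
x <- seq(0, 2 * pi, length.out = 200)
y <- sin(x) + rnorm(length(x), sd = 0.1)

x0 <- 1                                    # target point (arbitrary choice)
d  <- abs(x - x0)
bw <- sort(d)[ceiling(0.3 * length(x))]    # window covering about 30% of the data
w  <- pmax(1 - (d / bw)^3, 0)^3            # tricube weights, zero outside the window
u  <- x - x0                               # predictor centred at the target point

lin  <- lm(y ~ u,          weights = w, subset = w > 0)
quad <- lm(y ~ u + I(u^2), weights = w, subset = w > 0)

# In both fits the coefficient on u estimates the first derivative at x0
cmp <- rbind(linear    = c(slope = unname(coef(lin)["u"]),  confint(lin)["u", ]),
             quadratic = c(slope = unname(coef(quad)["u"]), confint(quad)["u", ]))
print(cmp)
print(cos(x0))  # true derivative, known only because the data are simulated
```

Repeating this over a grid of target points, and over a few window sizes, shows how the width of the interval and the bias move against each other for the two model orders.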

For (Q2) and (Q3): yes, of course, weighting the points used within LOESS is fine. Similarly, showing a LOESS fit and its associated CIs is perfectly reasonable.
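For Q2, stats::loess takes a `weights` argument, and predict() with `se = TRUE` returns the standard errors needed for a pointwise band. A minimal sketch follows; the data are simulated, and using the number of observations behind each mean difference (`n_obs`) as the weight is an assumption made for illustration, not something stated in the question.

```r
# Simulated data (hypothetical): one mean difference per time point, each
# based on a different number of underlying observations
set.seed(1)
time      <- 1:60
n_obs     <- sample(5:50, length(time), replace = TRUE)
mean_diff <- 0.05 * time + rnorm(length(time), sd = 2 / sqrt(n_obs))
dat <- data.frame(time, mean_diff, n_obs)

# Points backed by more observations get more influence on the smooth
fit <- loess(mean_diff ~ time, data = dat, weights = n_obs, span = 0.5)

# Pointwise 95% band from the standard errors reported by predict()
pr   <- predict(fit, se = TRUE)
crit <- qt(0.975, pr$df)
band <- data.frame(time = dat$time,
                   fit  = pr$fit,
                   lwr  = pr$fit - crit * pr$se.fit,
                   upr  = pr$fit + crit * pr$se.fit)

plot(dat$time, dat$mean_diff, xlab = "time", ylab = "mean difference")
lines(band$time, band$fit)
lines(band$time, band$lwr, lty = 2)
lines(band$time, band$upr, lty = 2)
```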

Related Solutions

Solved – Repeated loess smoothing for time series data

Try this:

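library(lattice)  # provides xyplot() and strip.default()

# For each country: fit a loess trend, then smooth the residuals of that fit
PlotData <- lapply(Countries, function(country) {
  dat <- NewDat[NewDat$Country == country, ]
  fit1 <- loess(Ozone ~ DecTime, data = dat, na.action = na.exclude)
  # na.exclude pads the residuals with NA so they line up with the rows of dat
  fit2 <- loess(residuals(fit1) ~ DecTime, data = dat, na.action = na.exclude)
  data.frame(DecTime = dat$DecTime, SmoothResiduals = predict(fit2), Country = country)
})

# Stack the per-country data frames into one
PlotData <- do.call(rbind, PlotData)

# One panel per country, with white strip backgrounds
xyplot(SmoothResiduals ~ DecTime | Country, data = PlotData, type = "l", col = 1,
       strip = function(bg = "white", ...) strip.default(bg = "white", ...))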

[Figure: Smoothed residuals]

Explanation: lapply iterates over the countries, applies a function to each one and collects the results in a list. The anonymous function does a loess fit (note my use of na.action=na.exclude to deal with NA values), does another loess fit on the residuals and returns a data.frame with the smoothed residuals. Subsequently, I combine all the data.frames and create the plot.

Obviously you might want to adjust the parameters of the loess fits.

Solved – confidence band around a smoothed function

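The earth package will estimate prediction intervals for you. Do it like this:

library(earth)  # also loads plotmo, which provides plotmo()

# ncross/nfold request cross-validation, which the variance model needs
mod = earth(y~x,data=dat, ncross=30, nfold=3, varmod.method="lm")
plotmo(mod, pt.col=1, level=.95) # show prediction intervals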

summary(mod) will print this:

varmod: method "lm"    min.sd 0.124    iter.rsq 0.016

stddev of predictions:
            coefficients iter.stderr iter.stderr%
(Intercept)       1.2265      0.2059           17
y                 0.0883      0.1918          217

                          mean   smallest   largest   ratio
95% prediction interval   4.86       3.91       5.1     1.3

The above code illustrates the principle, but a problem with your small sample size is that there is not enough data to estimate prediction intervals reliably. (We know this because the iter.stderr% values are very big and the iter.rsq is very small.) There is a vignette that comes with the earth package that explains in detail how to get prediction intervals, and some of the potential pitfalls.
