Solved – Confidence interval of first derivative of a loess smooth

confidence interval, loess, r

I have plotted the evolution of a parameter over time. I would, however, prefer to plot the change in this parameter over time. Here I have seen how to predict a confidence interval for a loess smooth, and here how to get the first derivative of a smooth. However, I do not know how to get the confidence interval of the first derivative of a loess smooth.

Q1) How can I plot the first derivative of a loess smooth with its confidence interval?

Q2) I would like to smooth mean differences between two categories, so the values on which the loess smooth is based are themselves computed from several values. Can I weight data points when using a loess smoothing function?

Q3) Is it acceptable to plot a loess function and its CI in a scientific paper?

Thanks!

Best Answer

There are two ways of doing what you describe (Q1), but unfortunately both are somewhat involved.

The first way (and probably the easier one) is to numerically differentiate the relevant LOESS curve (using, for example, numDeriv::grad). This might be reasonable if your LOESS curve is not very variable. If it has significant jumps or dips then, because numerical differentiation is essentially a finite-difference approximation, you might get nonsensical values. You will have to do this for each of the three curves (the lower CI bound, the upper CI bound and the curve itself).
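The finite-difference idea can be sketched in a few lines. The following is a minimal pure-Python illustration, not the numDeriv routine mentioned above: the function name `central_differences` is hypothetical, and the `smooth` values are a stand-in for LOESS fitted values evaluated on a sorted grid.

```python
import math


def central_differences(x, y):
    """Finite-difference derivative of y with respect to x on a sorted grid.

    Interior points use a central difference; the two end points fall
    back to one-sided differences.
    """
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("x and y must have the same length (at least 2)")
    deriv = []
    for i in range(n):
        lo = max(i - 1, 0)       # left neighbour (or the point itself at the start)
        hi = min(i + 1, n - 1)   # right neighbour (or the point itself at the end)
        deriv.append((y[hi] - y[lo]) / (x[hi] - x[lo]))
    return deriv


# Stand-in for a fitted smooth evaluated on a grid (assumption: in practice
# these would be the predicted LOESS values at the grid points).
grid = [i / 10 for i in range(11)]
smooth = [math.sin(t) for t in grid]
slope = central_differences(grid, smooth)
```

A jump between two neighbouring fitted values is divided by a small grid step, which is why a wiggly smooth can produce the nonsensical derivative values described above.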

The second way (probably more mathematically coherent, but quite involved) is to use the higher-order terms of your local weighted fit. This follows from the definition of LOESS. Remember that LOESS essentially does the following: it takes the data within a window/neighbourhood $S^*$, weights them according to a vector $w$ (usually derived from a tri-cube kernel in standard LOESS), and then fits a linear regression, i.e. $\hat{f}(s) = \beta_0 + w \beta_1 s $ for the points in your neighbourhood $S^* \subset S$, where $S$ is the whole support over which the data are recorded. The final LOESS estimate is then $\beta_0$. The upper and lower CI bounds are effectively the CI bounds for $\beta_0$. Now, if you want the derivative of these data, it is natural to simply use $\beta_1$; remember that the derivative is just the slope/gradient of the function. Each localised linear regression again gives you the relevant CI, this time for $\beta_1$, and you are good to go.
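The local-regression view can be made concrete with a small sketch. The code below is an illustration in pure Python under stated assumptions, not the algorithm of any particular LOESS routine: the function names are hypothetical, the predictor is centred at the target point `s0` so that the intercept is the fitted value and the slope is the derivative there, the residual variance `sigma2` is assumed known, errors are assumed independent with common variance, and the interval is an approximate normal one that ignores smoothing bias.

```python
import math


def tricube(d):
    """Tri-cube kernel: (1 - |d|^3)^3 inside the window, 0 outside."""
    d = abs(d)
    return (1.0 - d ** 3) ** 3 if d < 1.0 else 0.0


def local_linear_slope(x, y, s0, span=0.75, sigma2=1.0, z=1.96):
    """Local linear fit at s0.

    Returns (beta0, beta1, (lower, upper)), where the interval is for
    the slope beta1. sigma2 is an ASSUMED known residual variance.
    """
    n = len(x)
    k = min(n, max(3, math.ceil(span * n)))       # neighbourhood size
    h = sorted(abs(xi - s0) for xi in x)[k - 1]    # distance to k-th nearest point
    if h <= 0.0:
        raise ValueError("neighbourhood has zero width")
    u = [xi - s0 for xi in x]                      # centre the predictor at s0
    w = [tricube(ui / h) for ui in u]              # kernel weights
    s_w = sum(w)
    s_wu = sum(wi * ui for wi, ui in zip(w, u))
    s_wuu = sum(wi * ui * ui for wi, ui in zip(w, u))
    det = s_w * s_wuu - s_wu ** 2
    if abs(det) < 1e-12:
        raise ValueError("too few distinct points with positive weight")
    # Both estimates are linear combinations of y (weighted least squares).
    a = [wi * (s_wuu - s_wu * ui) / det for wi, ui in zip(w, u)]  # intercept row
    l = [wi * (s_w * ui - s_wu) / det for wi, ui in zip(w, u)]    # slope row
    beta0 = sum(ai * yi for ai, yi in zip(a, y))
    beta1 = sum(li * yi for li, yi in zip(l, y))
    se1 = math.sqrt(sigma2 * sum(li * li for li in l))
    return beta0, beta1, (beta1 - z * se1, beta1 + z * se1)


# Example on an exact straight line with an assumed residual variance of 0.01.
xs = [i / 20 for i in range(21)]
ys = [2.0 + 3.0 * xi for xi in xs]
fit, slope, slope_ci = local_linear_slope(xs, ys, s0=0.5, span=0.5, sigma2=0.01)
```

Repeating the call over a grid of `s0` values gives the derivative curve and its band. In practice `sigma2` would have to be estimated from the residuals of the smooth rather than supplied by hand.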

Notes:

  1. I do not know of a LOESS routine in R that does this natively. I co-authored a helper function (fdapace::Lwls1D) that uses local linear kernel smoothing for longitudinal data and lets you get the derivatives directly, but it does kernel smoothing and not LOESS. You might want to check its code to get a better idea of what you need to do. I remember that locfit::locfit.raw accepts some quite specialised arguments too, so you might want to check it as well. From what I recall it was a very good piece of software, so it is probably educational to give it a closer look.

  2. For mathematical coherence, one should actually fit the quadratic model $\hat{f}(s) = \beta_0 + w \beta_1 s + w \beta_2 s^2$ and then use $\beta_1$ for the first derivative. Depending on the size of your dataset, this might give better or worse estimates. In the context of kernel smoothing with Gaussian kernels, I found experimentally that the linear fit was less variable and only marginally more biased than the quadratic fit. I do not know whether the same holds for LOESS, so your mileage may vary.
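One way to write the local quadratic fit out explicitly is as a weighted least-squares problem at a target point. The notation here is an assumption added for illustration and does not appear above: $s_0$ is the point at which the smooth is evaluated, $y_i$ is the value observed at $s_i$, $w_i$ is the kernel weight of that observation, and the predictor is centred at $s_0$.

$$
(\hat\beta_0, \hat\beta_1, \hat\beta_2) = \arg\min_{\beta_0, \beta_1, \beta_2} \sum_{i:\, s_i \in S^*} w_i \left( y_i - \beta_0 - \beta_1 (s_i - s_0) - \beta_2 (s_i - s_0)^2 \right)^2,
\qquad
\hat{f}(s_0) = \hat\beta_0, \quad \hat{f}'(s_0) = \hat\beta_1 .
$$

With this centring, the fitted local polynomial has derivative $\hat\beta_1 + 2 \hat\beta_2 (s - s_0)$, which equals $\hat\beta_1$ at $s = s_0$; this is why $\beta_1$ can be read off as the first derivative. Without centring, the $2 \beta_2 s$ term would have to be added back.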

For (Q2) and (Q3): yes, of course, weighting the points used within LOESS is fine. Similarly, showing a LOESS fit and its associated CIs is perfectly reasonable.
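As a rough illustration of how such weighting enters the fit, the sketch below multiplies each kernel weight by a case weight before solving the local regression. It is a minimal pure-Python example with hypothetical names and made-up numbers: the case weights are taken to be the number of observations behind each mean, which is only one possible choice, and a fixed bandwidth `h` is used for brevity instead of a nearest-neighbour span.

```python
def tricube(d):
    """Tri-cube kernel: (1 - |d|^3)^3 inside the window, 0 outside."""
    d = abs(d)
    return (1.0 - d ** 3) ** 3 if d < 1.0 else 0.0


def weighted_local_line(x, y, s0, h, case_weights=None):
    """Intercept and slope of a local line at s0.

    Each point's total weight is its kernel weight times its case weight.
    """
    if case_weights is None:
        case_weights = [1.0] * len(x)
    s_w = s_wu = s_wuu = s_wy = s_wuy = 0.0
    for xi, yi, ci in zip(x, y, case_weights):
        u = xi - s0                      # centre the predictor at s0
        w = ci * tricube(u / h)          # combined weight
        s_w += w
        s_wu += w * u
        s_wuu += w * u * u
        s_wy += w * yi
        s_wuy += w * u * yi
    det = s_w * s_wuu - s_wu ** 2
    beta0 = (s_wuu * s_wy - s_wu * s_wuy) / det
    beta1 = (s_w * s_wuy - s_wu * s_wy) / det
    return beta0, beta1


# Made-up example: the mean at x = 2 rests on a single observation,
# the others on 20 observations each.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 5.0, 3.0, 4.0]
n_obs = [20, 20, 1, 20, 20]
plain = weighted_local_line(x, y, s0=2.0, h=3.0)
weighted = weighted_local_line(x, y, s0=2.0, h=3.0, case_weights=n_obs)
```

In this example the poorly supported mean at x = 2 pulls the unweighted fit upwards, while the case-weighted fit stays closer to the trend of the well-supported points. In R, `stats::loess` takes a `weights` argument that serves this purpose.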