Not really a full answer, but too long for a comment: s sets up a spline, whereas loess does a local regression. In the gam package (maybe mgcv too, I'm not too familiar with that one) you can also fit a local regression, as in
library(gam)
set.seed(1234)
# generate data
x <- sort(runif(100))
y <- sin(2*pi*x) + rnorm(100, sd = 0.1)
# local regression via gam's lo() versus base R's loess()
gam.1 <- gam(y ~ lo(x))
base.r <- loess(y ~ x)
summary(base.r$fitted - gam.1$fitted)
plot(base.r$fitted, gam.1$fitted)
That does not produce the same fitted values either, but maybe you can further play around with the settings of lo and loess.
[In the later discussion of LOESS here I attempt to describe LOWESS and its implementation in the R function lowess, as well as outline some of the modifications made for the function loess (though some details that don't seem directly relevant to your questions are omitted).]
In particular: with smoothing splines, how do we choose the number and location of breakpoints
You don't; there's one at every data point; the smoothing parameter is the source of all the regularization. If you want fewer knots, you're talking about penalized splines.
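A quick illustration of that point with base R's smooth.spline (df is a standard argument of smooth.spline: a target equivalent degrees of freedom):

```r
# smooth.spline places a knot at (essentially) every data point;
# the smoothing parameter (spar, or equivalently df) does all
# the regularization -- the user never chooses knot locations.
set.seed(1234)
x <- sort(runif(100))
y <- sin(2*pi*x) + rnorm(100, sd = 0.1)

fit.rough  <- smooth.spline(x, y, df = 20)  # less smoothing
fit.smooth <- smooth.spline(x, y, df = 5)   # more smoothing

# Both are cubic smoothing splines; only the penalty differs
c(fit.rough$df, fit.smooth$df)
```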
as well as the polynomial degree of the spline?
With smooth.spline it's always cubic; it says so right in the help.
If you mean the degree of the local fit in LOESS (which is not a spline), first see Cleveland [1] (which describes LOWESS, on which LOESS is based). It pretty much suggests that degree 0 isn't flexible enough ("in the practical situation, an assumption of local linearity serves better than local constancy"), while degree 2 is harder to compute for a relatively small gain in flexibility, and it suggests degree 1 as the best compromise in practice.
The suggestions in Cleveland [1] (more details about choosing the various parameters are given in the paper) are the defaults in the R function lowess (such as degree 1 and span 2/3). The help for loess says it uses different defaults (degree 2 and span 3/4).
And what do the bandwidth arguments control in the function?
As described by Bill Cleveland [1], LOESS applies a tricube weight function ($W(x)=((1-|x|^3)_+)^3$) to locally weight the points. $W$ is scaled so that the $r$-th nearest neighbour is the first to get zero weight, where $r = \text{round}(fn)$ and $f$ is the span argument. If there are multiple predictors this is modified (see the help on loess).
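To make that concrete, here is a small sketch of the tricube weight function (a direct translation of the formula above, not code from any package):

```r
# Tricube weight function from Cleveland (1979):
# W(x) = (1 - |x|^3)^3 for |x| < 1, and 0 otherwise
tricube <- function(x) ifelse(abs(x) < 1, (1 - abs(x)^3)^3, 0)

# Distances are scaled by the distance to the r-th nearest
# neighbour, where r = round(f * n) and f is the span
tricube(0)    # 1: the point being fitted gets full weight
tricube(0.5)  # (1 - 0.125)^3 = 0.669921875
tricube(1)    # 0: the r-th nearest neighbour is the first to get zero weight
```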
The loess function allows you to specify a target number-of-parameters equivalent (the enp.target argument) instead of the span.
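For example (enp.target is a documented argument of stats::loess; the fitted object's enp component reports the resulting equivalent number of parameters):

```r
set.seed(1234)
x <- sort(runif(100))
y <- sin(2*pi*x) + rnorm(100, sd = 0.1)

# Instead of a span, ask loess for roughly 5 equivalent parameters;
# loess then chooses a span to match that complexity
fit <- loess(y ~ x, enp.target = 5)
fit$enp  # should be close to the requested 5
```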
Also how does LOESS select outliers for removal?
Again, as described in Cleveland [1], LOWESS downweights observations with large residuals rather than specifically selecting and removing them. However, some observations may get zero weight, which means some are effectively removed. Specifically, after an initial fit LOWESS introduces robustness weights based on the residuals from that initial fit. The robustness weights use a biweight function ($B(x)=((1-x^2)_+)^2$); any observation whose absolute residual is more than six times the median absolute residual gets zero weight, while points closer than that still have reduced weight; for example, a point with absolute residual 3.25 times the median absolute residual will have about half weight.
This downweighting process is iterated (that is, residuals are recalculated from a fit using these weights, and the robustness weights recalculated in turn, until convergence). Note that both $W$ and $B$ can downweight a given observation.
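A small sketch of those robustness weights (again a direct translation of the formula, not package code):

```r
# Bisquare (biweight) robustness function from Cleveland (1979):
# B(x) = (1 - x^2)^2 for |x| < 1, and 0 otherwise
bisquare <- function(x) ifelse(abs(x) < 1, (1 - x^2)^2, 0)

# Residuals are scaled by 6 * median(|residual|) before applying B, so:
bisquare(3.25/6)  # about 0.5: half weight at 3.25 x the median abs residual
bisquare(1)       # 0: zero weight at 6 x the median abs residual or beyond
```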
The help for the implementation of loess refers to redescending M-estimation using a biweight function, but that is presumably just a brief way of describing the above scheme rather than doing anything different.
[1] Cleveland, William S. (1979). "Robust Locally Weighted Regression and Smoothing Scatterplots". Journal of the American Statistical Association, 74 (368): 829–836.
Best Answer
Here is some R code/example that will let you compare the fits for a loess fit and a spline fit:
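(The original code block was not preserved here; the following is a minimal sketch of such a comparison on simulated data, using only base R's loess and smooth.spline:)

```r
set.seed(1234)
x <- sort(runif(100))
y <- sin(2*pi*x) + rnorm(100, sd = 0.1)

loess.fit  <- loess(y ~ x)         # local regression (default span 0.75, degree 2)
spline.fit <- smooth.spline(x, y)  # penalized cubic spline, smoothing chosen by GCV

# Overlay the two fits on the data to compare them visually
plot(x, y, col = "grey", main = "loess vs smooth.spline")
lines(x, fitted(loess.fit), col = "red")
lines(spline.fit, col = "blue")
legend("topright", legend = c("loess", "smooth.spline"),
       col = c("red", "blue"), lty = 1)
```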
You can try it with your data and change the code to try other types or options. You may also want to look at the loess.demo function in the TeachingDemos package for a better understanding of what the loess algorithm does. Note that what you see from loess is often a combination of loess with a second interpolation smoothing (sometimes itself a spline); the loess.demo function actually shows both the smoothed and the raw loess fit. Theoretically you can always find a spline that approximates another continuous function as closely as you want, but it is unlikely that there will be a simple choice of knots that will reliably give a close approximation to a loess fit for any data set.