Solved – Smoothing spline

splines

What effect would it have on a smoothing spline to use the third (or fourth) derivative for the penalty term? Specifically, what would be the effect on the RSS if the tuning parameter were to be varied from 0 to infinity?
$$
\mathrm{RSS} = \sum_i \left(y_i - f(x_i)\right)^2 + \lambda \int \left(f''(t)\right)^2 \, dt
$$

Best Answer

Let's say that we penalize the 4th derivative of $f$. Then any $f$ of the form $f(x) = ax^3 + bx^2 + cx + d$ has a penalty of 0, because its 4th derivative is identically zero. When we penalize the 2nd derivative, as $\lambda \rightarrow \infty$ we are left with the linear model that minimizes the RSS. With the 4th-derivative penalty, as $\lambda \rightarrow \infty$ we can instead fit anything up to a cubic with no penalty, so we are left with the cubic regression that minimizes the RSS. A cubic regression always does at least as well as a linear model with respect to in-sample RSS, so the limiting RSS is no larger than before. But the whole point of splines is to avoid the issues that come with fitting a global polynomial (like erratic behavior near the edges of the data). So it seems to me that penalizing a higher-order derivative would hamstring splines: the fit can never be less flexible than a cubic polynomial regression, and for large $\lambda$ a global polynomial is forced upon us.
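The same argument can be written for a penalty on a general $m$-th derivative, which also covers the third-derivative case in the question. The symbols $m$ and $\hat f_\lambda$ are notation introduced here for illustration and do not appear in the original post:

$$
\hat f_\lambda = \arg\min_f \; \sum_i \left(y_i - f(x_i)\right)^2 + \lambda \int \left(f^{(m)}(t)\right)^2 \, dt,
\qquad
f^{(m)} \equiv 0 \iff f \text{ is a polynomial of degree at most } m - 1 .
$$

So as $\lambda \rightarrow \infty$ the fit is pushed toward the least-squares polynomial of degree $m - 1$: a line for $m = 2$, a quadratic for $m = 3$, and a cubic for $m = 4$.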

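As $\lambda \rightarrow 0$ I don't think it matters what we penalize, since the penalty vanishes either way: the fit is free to interpolate the data and the RSS goes to 0 (assuming distinct $x_i$). The choice of derivative only makes a real difference for large $\lambda$.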
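To see both limits numerically, here is a minimal sketch. It assumes a discrete stand-in for the smoothing spline: it penalizes the order-th differences of the fitted values on an equally spaced grid (a Whittaker-type smoother), not the integrated squared derivative itself. The toy data and the names `penalized_fit`, `rss`, and `results` are made up for illustration.

```python
import numpy as np

def penalized_fit(y, lam, order):
    """Minimize ||y - f||^2 + lam * ||D f||^2 over f, where D takes
    order-th differences (a discrete stand-in for the order-th derivative)."""
    n = len(y)
    D = np.diff(np.eye(n), n=order, axis=0)  # (n - order) x n difference matrix
    return np.linalg.solve(np.eye(n) + lam * (D.T @ D), y)

def rss(y, f):
    """In-sample residual sum of squares (without the penalty term)."""
    return float(np.sum((y - f) ** 2))

# Hypothetical toy data: a noisy sine on an equally spaced grid.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(x.size)

lams = [0.0, 1.0, 1e2, 1e4, 1e8]
results = {}
for order in (2, 3, 4):
    fits = [penalized_fit(y, lam, order) for lam in lams]
    # Least-squares polynomial of degree order - 1: its order-th differences
    # vanish, so it is the expected limit of the fit as lam grows.
    poly = np.polyval(np.polyfit(x, y, order - 1), x)
    results[order] = {
        "rss_path": [rss(y, f) for f in fits],
        "rss_poly": rss(y, poly),
        "gap_to_poly": float(np.max(np.abs(fits[-1] - poly))),
    }

# One row per penalty order: RSS along the lambda path, then the polynomial RSS.
for order, r in results.items():
    print(order, np.round(r["rss_path"], 4), round(r["rss_poly"], 4))
```

In this sketch, each row should start at an RSS of 0 for $\lambda = 0$ and rise toward the RSS of the matching polynomial fit as $\lambda$ grows. The limiting RSS should be smallest for the 4th-order penalty, since a cubic fits in-sample at least as well as a quadratic or a line.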
