Regression – Definition and Use of Natural Cubic Splines

constraint, cubic, degrees of freedom, regression, splines

I am learning about splines from the book "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Hastie et al. On page 145 I found that natural cubic splines are linear beyond the boundary knots. There are $K$ knots, $\xi_1, \xi_2, \dots, \xi_K$, in the spline, and the book gives the following about such a spline:

[image: excerpt from the book]

Question 1: How are 4 degrees of freedom freed up? I don't get this part.

Question 2: In the definition of $d_k(X)$, setting $k=K$ gives $d_K(X) = \frac{0}{0}$. What is the author trying to do in this formula? How does it help make sure that the splines are linear beyond the boundary knots?

Best Answer

  1. Let's start by considering ordinary cubic splines. They're cubic between every pair of knots and cubic outside the boundary knots. We start with 4 df for the first cubic (left of the first boundary knot), and each knot adds one new parameter: the new cubic piece has 4 coefficients, but continuity of the function, its first derivative and its second derivative at the knot imposes three constraints, leaving one free parameter. That makes a total of $K+4$ parameters for $K$ knots.

    A natural cubic spline is linear beyond the boundary knots at both ends. This constrains the cubic and quadratic coefficients there to 0, each reducing the df by 1. That's 2 df at each of the two ends of the curve, reducing $K+4$ to $K$.

    Imagine you decide you can spend some total number of degrees of freedom ($p$, say) on your non-parametric curve estimate. Since a natural cubic spline uses 4 fewer degrees of freedom than an ordinary cubic spline with the same number of knots, with those $p$ parameters you can afford 4 more knots (and so 4 more parameters) to model the curve between the boundary knots.

  2. Note that the definition of $N_{k+2}$ is for $k=1,2,\dots,K-2$ (since there are $K$ basis functions in all). So the last basis function in that list is $N_{K}=d_{K-2}-d_{K-1}$, and the highest $k$ needed in the definition of $d_k$ is $k=K-1$. (That is, we don't need to figure out what some $d_K$ might do, since we never use it.)
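
To make both points concrete, here is a minimal NumPy sketch (mine, not from the book). It assumes the book's formulas are $N_1(X)=1$, $N_2(X)=X$, $N_{k+2}(X)=d_k(X)-d_{K-1}(X)$ with $d_k(X)=\frac{(X-\xi_k)_+^3-(X-\xi_K)_+^3}{\xi_K-\xi_k}$, and that the ordinary cubic spline uses the truncated power basis. The knot values and function names are made up for illustration.

```python
import numpy as np

def pos_cube(t):
    """Truncated cubic (t)_+^3."""
    return np.maximum(t, 0.0) ** 3

def cubic_spline_basis(x, knots):
    """Ordinary cubic spline in the truncated power basis: K + 4 columns."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [pos_cube(x - xi) for xi in knots]
    return np.column_stack(cols)

def d(x, knots, k):
    """d_k(X) with 1-based k; only k = 1, ..., K-1 is ever needed."""
    xi_k, xi_K = knots[k - 1], knots[-1]
    return (pos_cube(x - xi_k) - pos_cube(x - xi_K)) / (xi_K - xi_k)

def natural_cubic_basis(x, knots):
    """Natural cubic spline basis N_1, ..., N_K: K columns."""
    x = np.asarray(x, dtype=float)
    K = len(knots)
    d_last = d(x, knots, K - 1)
    cols = [np.ones_like(x), x]
    # k runs over 1, ..., K-2, so d_K (the 0/0 case) never appears
    cols += [d(x, knots, k) - d_last for k in range(1, K - 1)]
    return np.column_stack(cols)

knots = [0.0, 1.0, 2.0, 3.5, 5.0]  # arbitrary example knots, K = 5
grid = np.linspace(-1.0, 6.0, 200)

print(cubic_spline_basis(grid, knots).shape[1])   # number of columns: K + 4
print(natural_cubic_basis(grid, knots).shape[1])  # number of columns: K

# On equally spaced points, a linear function has zero second differences.
right = np.linspace(6.0, 10.0, 9)  # all to the right of the last knot
curv_N = np.diff(natural_cubic_basis(right, knots), n=2, axis=0)
curv_d = np.diff(d(right, knots, 1), n=2)
print(np.allclose(curv_N, 0.0, atol=1e-8))  # is every N_j linear out here?
print(np.allclose(curv_d, 0.0, atol=1e-8))  # is a single d_k linear out here?
```

The reason the differences behave differently from a single $d_k$: for $X > \xi_K$, each $d_k(X)$ expands to $3X^2 - 3(\xi_k+\xi_K)X + (\xi_k^2+\xi_k\xi_K+\xi_K^2)$. The quadratic term is the same for every $k$, so it cancels in $d_k - d_{K-1}$ and only a linear function is left. To the left of $\xi_1$ every $d_k$ is zero, so only the $1$ and $X$ terms remain there.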
