Solved – Invertibility in Reinsch form Derivation (Smoothing Splines)

machine learning, smoothing, splines

In The Elements of Statistical Learning (2nd edition), in the context of smoothing splines, $\pmb{S_{\lambda}}$ is defined as
$$
\pmb{S_{\lambda}} = \pmb{N}(\pmb{N}^T\pmb{N} + \lambda \pmb{\Omega_N})^{-1}\pmb{N}^T
$$
$\pmb{N}$ is an $N \times N$ square matrix with entries $\pmb{N}_{i,j} = N_j(x_i)$, where the basis functions $N_j$ are defined as
$$\left\{\begin{array}{ll}
N_1(X) = 1\\
N_2(X) = X \\
N_{j+2}(X) = d_j(X)-d_{N-1}(X), \quad j = 1, \dots, N-2
\end{array}
\right.
$$

where the knots $\xi_{j}$ are the $x_j$ and

$$
d_j(X)=\frac{(X-\xi_{j})_+^3-(X-\xi_{N})_+^3}{\xi_{N}-\xi_{j}}
$$
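For concreteness, here is a small Python sketch (my own helper and variable names, not code from the book) that evaluates this basis and assembles the matrix $\pmb{N}$ at the knots:

```python
import numpy as np

def natural_spline_basis(x, knots):
    """Evaluate the natural cubic spline basis N_1, ..., N_K defined above
    at the points x, for strictly increasing knots xi_1 < ... < xi_K."""
    x = np.asarray(x, dtype=float)
    xi = np.asarray(knots, dtype=float)
    K = len(xi)

    def d(j, t):
        # d_j(X) = [(X - xi_j)_+^3 - (X - xi_K)_+^3] / (xi_K - xi_j), 1-based j
        return (np.maximum(t - xi[j - 1], 0.0) ** 3
                - np.maximum(t - xi[K - 1], 0.0) ** 3) / (xi[K - 1] - xi[j - 1])

    B = np.empty((len(x), K))
    B[:, 0] = 1.0                      # N_1(X) = 1
    B[:, 1] = x                        # N_2(X) = X
    for j in range(1, K - 1):          # N_{j+2}(X) = d_j(X) - d_{K-1}(X)
        B[:, j + 1] = d(j, x) - d(K - 1, x)
    return B

# The matrix N of the question: entry (i, j) is N_j(x_i), evaluated at the knots.
knots = np.array([0.3, 0.9, 1.4, 2.0, 2.7, 3.5])
N = natural_spline_basis(knots, knots)
print(N.shape)  # (6, 6)
```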

Assuming that $\pmb{N}$ is invertible, the Reinsch form can be obtained as
\begin{align}
\begin{aligned}
\pmb{S_{\lambda}}
& = \pmb{N}(\pmb{N}^T\pmb{N} + \lambda \pmb{\Omega_N})^{-1}\pmb{N}^T \\
& = \pmb{N}(\pmb{N}^T\pmb{N} + \lambda\pmb{N}^T\pmb{N}^{-T}\pmb{\Omega_N} \pmb{N}^{-1}\pmb{N})^{-1}\pmb{N}^T \\
& = \pmb{N}[\pmb{N}^T(\pmb{I} + \lambda \pmb{N}^{-T} \pmb{\Omega_N} \pmb{N}^{-1})\pmb{N}]^{-1}\pmb{N}^T \\
& = \pmb{N}\pmb{N}^{-1}(\pmb{I} + \lambda \pmb{N}^{-T} \pmb{\Omega_N} \pmb{N}^{-1})^{-1}\pmb{N}^{-T}\pmb{N}^T \\
& = (\pmb{I} + \lambda \pmb{K})^{-1}
\end{aligned}
\end{align}

where $\pmb{K} = \pmb{N}^{-T} \pmb{\Omega_N} \pmb{N}^{-1}$.
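The algebra above can be sanity-checked numerically: the identity only requires an invertible $\pmb{N}$ and a symmetric $\pmb{\Omega_N}$, so a random symmetric stand-in for the penalty matrix is enough. The sketch below checks the identity only; it does not construct the true $\pmb{\Omega_N}$, which involves integrals of second derivatives of the basis functions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
lam = 0.7

# Stand-ins: any invertible N and symmetric Omega satisfy the identity.
N = rng.normal(size=(n, n)) + n * np.eye(n)    # well-conditioned, invertible
A = rng.normal(size=(n, n))
Omega = A @ A.T                                 # symmetric (and PSD)

# Left-hand side: S_lambda = N (N^T N + lam * Omega)^{-1} N^T
S_direct = N @ np.linalg.solve(N.T @ N + lam * Omega, N.T)

# Right-hand side: (I + lam * K)^{-1} with K = N^{-T} Omega N^{-1}
N_inv = np.linalg.inv(N)
K = N_inv.T @ Omega @ N_inv
S_reinsch = np.linalg.inv(np.eye(n) + lam * K)

print(np.allclose(S_direct, S_reinsch))        # True
```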

My question is:
How to show that $\pmb{N}$ is invertible?

Best Answer

I was hoping to find a more intuitive proof, but the matrix can be shown to be invertible by directly showing that its determinant is nonzero.

First note that the knots are located at the $K$ unique values of $\mathbf{x}$. Let $\boldsymbol{\xi}$ represent these knots, in ascending order.

Also, $d_j(\xi_i) = 0$ if $i \le j$, and so $N_{i,j} = 0$ whenever $j > i + 1$, which makes $\mathbf{N}$ "almost lower triangular". In fact, it is a lower triangular matrix with a column of 1's appended to the front of it.
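Assuming the natural_spline_basis sketch from the question section above is in scope (again my helper, not the book's code), this zero pattern can be inspected directly at a few knots:

```python
import numpy as np

# Build N at a handful of strictly increasing knots and check which entries
# vanish: with 0-based indices, N[i, j] should be 0 whenever j > i + 1.
xi = np.linspace(0.5, 3.0, 6)
N = natural_spline_basis(xi, xi)
K = len(xi)

for i in range(K):
    for j in range(K):
        if j > i + 1:
            assert np.isclose(N[i, j], 0.0)

# '.' marks a zero entry, 'x' a nonzero one.
print(np.where(np.isclose(N, 0.0), ".", "x"))
```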

We can take advantage of this structure to obtain a simple expression for the determinant. First we transform $\mathbf{N}$ by subtracting the first row from all others. Call this matrix $\mathbf{M}$. (These row operations do not affect the determinant, so $\text{det}(\mathbf{M}) = \text{det}(\mathbf{N})$.)

We have: $$ \begin{aligned} \mathbf{M} &= \begin{pmatrix} 1 & \xi_1 & 0 & 0 & \cdots & 0 \\ 0 & \xi_2 - \xi_1 & 0 & 0 & \cdots & 0 \\ 0 & \xi_3 - \xi_1 & N_3(\xi_3) & 0 & \cdots & 0 \\ 0 & \xi_4 - \xi_1 & N_3(\xi_4) & N_4(\xi_4) & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & \xi_K - \xi_1 & N_3(\xi_K) & N_4(\xi_K) & \cdots & N_K(\xi_K) \end{pmatrix} \\ \end{aligned} $$

By expanding along the first column, we see that the determinant of $\mathbf{M}$ is equal to that of the sub-matrix formed by deleting the first column and the first row. Because this sub-matrix is lower triangular, the determinant is the product of its diagonal entries:

$$ \text{det}(\mathbf{M}) = (\xi_2 - \xi_1) \prod_{i=3}^K N_i(\xi_i) $$

We know that $\xi_2 - \xi_1 \ne 0$. Thus it suffices to show that $N_i(\xi_i) \ne 0$ for all $i \in \{3, \dots, K\}$, as follows:

$$ \begin{aligned} N_i(\xi_i) &= d_{i-2}(\xi_i) - d_{K-1}(\xi_i) \\ &= \frac{(\xi_i - \xi_{i-2})_+^3 - (\xi_i-\xi_{K})_+^3}{\xi_{K}-\xi_{i-2}} - \frac{(\xi_i-\xi_{K-1})_+^3 - (\xi_i-\xi_{K})_+^3}{\xi_{K}-\xi_{K-1}} \\ &= \frac{(\xi_i - \xi_{i-2})_+^3}{\xi_{K}-\xi_{i-2}} - \frac{(\xi_i-\xi_{K-1})_+^3}{\xi_{K}-\xi_{K-1}} \end{aligned} $$

If $i < K$, then we have $$ N_i(\xi_i) = \frac{(\xi_i - \xi_{i-2})_+^3}{\xi_{K}-\xi_{i-2}} \ne 0 $$ by the fact that the $\xi_i$ are strictly increasing. If $i = K$, then we have $$ \begin{aligned} N_K(\xi_K) &= (\xi_K - \xi_{K-2})^2 - (\xi_K - \xi_{K-1})^2 \\ &> (\xi_K - \xi_{K-2})^2 - (\xi_K - \xi_{K-2})^2 = 0 \end{aligned} $$ where the inequality holds because $\xi_{K-2} < \xi_{K-1}$.

So $N_K(\xi_K) \ne 0$ and the proof is complete.
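Finally, the facts claimed above — that the diagonal entries $N_i(\xi_i)$ for $i \ge 3$ are nonzero and that $\det(\mathbf{N}) \ne 0$ — can be spot-checked numerically, again assuming the natural_spline_basis helper sketched earlier (the knot values below are arbitrary):

```python
import numpy as np

xi = np.array([0.3, 0.9, 1.4, 2.0, 2.7, 3.5, 4.1, 5.0])   # distinct, ascending knots
N = natural_spline_basis(xi, xi)
K = len(xi)

# Diagonal entries N_i(xi_i) for i = 3, ..., K (0-based: N[2, 2], ..., N[K-1, K-1]).
diag = np.diag(N)[2:]
print(diag)
assert np.all(diag > 0)

# N has full rank, i.e. its determinant is nonzero and N is invertible.
print(np.linalg.det(N))
assert np.linalg.matrix_rank(N) == K
```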
