[Math] How to work out orthogonal polynomials for regression model

orthogonal-polynomials, regression

I put this question here as it has a pure maths element to it, even though it has a statistical twist. I have been given the following table of data:

$$\begin{matrix} i & \mathrm{Response} \, y_i & \mathrm{Covariate} \, x_i \\ 1 & -1 & -0.5 \\ 2 & -0.25 &-0.25 \\ 3 & 1 & 0 \\ 4 & 1.2 & 0.25 \\ 5 & 2.6 & 0.5 \end{matrix}$$

From this, I need to create an orthogonal regression model. In other words, I need to come up with a model of the form

$$Y_i = \alpha_0 + \alpha_1(ax_i + b) + \alpha_2(dx_i^2 + ex_i + f)$$

where the basis polynomials each sum to zero over the data and are orthogonal to one another,

$$\sum_{i=1}^n (ax_i + b) = \sum_{i=1}^n (dx_i^2 + ex_i + f) = \sum_{i=1}^n (ax_i + b)(dx_i^2 + ex_i + f) = 0,$$

and $\alpha_0, \alpha_1, \alpha_2$ are constants.

From looking at it, I can see that the linear term can just be $x_i$, since $\sum_{i=1}^n x_i = 0$, but I'm stuck on how to solve for the quadratic term.

How can I do this?

Best Answer

It looks as if you're using the (discrete) inner product

$$\langle f,g\rangle=\sum_{j=1}^n f(x_j)\, g(x_j)$$

and I'll proceed with that assumption.

The canonical method for producing a basis of orthogonal polynomials with respect to some (discrete or continuous) inner product is the Stieltjes procedure. I discussed the continuous case here; the discrete case proceeds similarly.

In any event, to summarize: if you have a set of monic orthogonal polynomials $\phi_k(x)$ satisfying the recurrence

$$\phi_{k+1}(x)=(x+b_k)\phi_k(x)+c_k\phi_{k-1}(x)$$

with the associated orthogonality condition $\langle \phi_j,\phi_k\rangle=0,\quad j\neq k$ and the initial values $\phi_{-1}(x)=0$ and $\phi_0(x)=1$, the recurrence coefficients $b_k$ and $c_k$ are obtained through the formulae

$$\begin{align*}b_k&=-\frac{\langle x\phi_k ,\phi_k\rangle}{\langle \phi_k,\phi_k\rangle}\\c_k&=-\frac{\langle\phi_k,\phi_k\rangle}{\langle\phi_{k-1},\phi_{k-1}\rangle}\end{align*}$$

where $x\phi_k$ denotes the product of $x$ and $\phi_k(x)$.

For this particular case, as you say, $\phi_1(x)=x$ is easily determined (indeed, $b_0=-\langle x\cdot 1,1\rangle/\langle 1,1\rangle=0$, since the $x_j$ sum to zero); thus, to compute $\phi_2(x)$, we compute $b_1=0$ and $c_1=-\frac18$ to yield $\phi_2(x)=x^2-\frac18$.
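As a numerical sanity check, the Stieltjes recurrence above can be run directly on the data. Here is a small sketch using `numpy.polynomial.Polynomial` (the variable names are mine, not part of any standard API):

```python
import numpy as np
from numpy.polynomial import Polynomial as P

x = np.array([-0.5, -0.25, 0.0, 0.25, 0.5])  # covariate values from the table

def inner(f, g):
    """Discrete inner product <f, g> = sum_j f(x_j) g(x_j)."""
    return np.sum(f(x) * g(x))

X = P([0.0, 1.0])    # the polynomial "x"
phi = [P([1.0])]     # phi_0(x) = 1  (phi_{-1} = 0 is handled below)

# phi_{k+1}(x) = (x + b_k) phi_k(x) + c_k phi_{k-1}(x)
for k in range(2):
    b_k = -inner(X * phi[k], phi[k]) / inner(phi[k], phi[k])
    nxt = (X + b_k) * phi[k]
    if k > 0:  # c_k term only appears once phi_{k-1} is nonzero
        c_k = -inner(phi[k], phi[k]) / inner(phi[k - 1], phi[k - 1])
        nxt = nxt + c_k * phi[k - 1]
    phi.append(nxt)

print(phi[1].coef)   # [0. 1.]            -> phi_1(x) = x
print(phi[2].coef)   # [-0.125  0.  1.]   -> phi_2(x) = x^2 - 1/8
```

This reproduces $b_1=0$, $c_1=-\frac18$, and $\phi_2(x)=x^2-\frac18$ as claimed.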

We can then determine that the quadratic least-squares fit to the data is given by $0.71+3.46\,\phi_1(x)+\frac27\,\phi_2(x)$, where $\frac27 \approx 0.2857$.
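Orthogonality is what makes the fit easy: each coefficient is an independent projection, $\alpha_k = \langle y,\phi_k\rangle/\langle \phi_k,\phi_k\rangle$, with no linear system to solve. A quick numerical check (again, a sketch; the names are mine):

```python
import numpy as np

x = np.array([-0.5, -0.25, 0.0, 0.25, 0.5])
y = np.array([-1.0, -0.25, 1.0, 1.2, 2.6])

# orthogonal basis evaluated at the data points
phi0 = np.ones_like(x)
phi1 = x
phi2 = x**2 - 0.125

# alpha_k = <y, phi_k> / <phi_k, phi_k>  (projection onto each basis function)
alphas = [np.dot(y, p) / np.dot(p, p) for p in (phi0, phi1, phi2)]
print(alphas)   # [0.71, 3.46, 0.2857...]
```

Note that each `alpha_k` depends only on its own basis function, so adding a higher-degree term later would leave the earlier coefficients unchanged.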

If you want more details on data-fitting with orthogonal polynomials, see this article by George Forsythe. You might also want to look up the Gram polynomials, which are the standard orthogonal polynomials used for data fitting when the data has equispaced abscissas.
