Experimental Design: Choose Data Points to Minimize Quadratic Term Variance in Multiple Regression

Tags: experiment-design, mathematical-statistics, minimum-variance, multiple-regression, self-study

$\newcommand{\eps}{\varepsilon}\newcommand{\szdp}[1]{\!\left(#1\right)}
\newcommand{\szdb}[1]{\!\left[#1\right]}$

Problem Statement: Suppose that you wish to fit a model
$$Y=\beta_0+\beta_1x+\beta_2x^2+\eps$$
to a set of $n$ data points. If the $n$ points are to be allocated at the
design points $x=-1,0,1,$ what fraction should be assigned to each value of $x$
so as to minimize $V\big(\hat\beta_2\big)?$ (Assume that $n$ is large and that
$k_1, k_2,$ and $k_3,\; k_1+k_2+k_3=1,$ are the fractions of the total number
of observations to be assigned at $x=-1,0,$ and $1,$ respectively.)

Note: This is Exercise 12.35 in Mathematical Statistics with Applications, 5th Ed., by Wackerly, Mendenhall, and Scheaffer.

My Work So Far: We know from the properties of linear regression estimators that
$$V\big(\hat\beta_2\big)
=c_{22}\sigma^2,$$

where $c_{22}$ is the entry of $(\mathbf{X}'\mathbf{X})^{-1}$ in row $2$ and
column $2$ (rows and columns indexed from zero). Now if $x_i$ denotes the
$x$ value of the $i$th data point, then we have
$$
\mathbf{X}=
\szdb{\begin{matrix}1&x_1&x_1^2\\1&x_2&x_2^2\\ \vdots&\vdots&\vdots
\\1&x_n&x_n^2\end{matrix}},
$$

so that $\mathbf{X}'\mathbf{X}$ is
\begin{align*}
\mathbf{X}'\mathbf{X}
&=\szdb{\begin{matrix}1&1&\cdots&1\\x_1 &x_2 &\cdots &x_n\\
x_1^2 &x_2^2 &\cdots &x_n^2\end{matrix}}
\szdb{\begin{matrix}1&x_1&x_1^2\\1&x_2&x_2^2\\ \vdots&\vdots&\vdots
\\1&x_n&x_n^2\end{matrix}}\\
&=n\szdb{\begin{matrix}
\mu_0'&\mu_1'&\mu_2'\\
\mu_1'&\mu_2'&\mu_3'\\
\mu_2'&\mu_3'&\mu_4'\\
\end{matrix}},
\end{align*}

where $\mu_k':=\frac1n\sum_{i=1}^nx_i^k.$
A simplification: since every $x_i\in\{-1,0,1\},$ we have $x_i^3=x_i$ and
$x_i^4=x_i^2,$ so that $\mu_3'=\mu_1'$ and $\mu_4'=\mu_2',$ as well as
$\mu_0'=1.$ The matrix therefore becomes
\begin{align*}
\mathbf{X}'\mathbf{X}
&=n\szdb{\begin{matrix}
1&\mu_1'&\mu_2'\\
\mu_1'&\mu_2'&\mu_1'\\
\mu_2'&\mu_1'&\mu_2'\\
\end{matrix}}.
\end{align*}
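
Since the problem statement allocates $nk_1,$ $nk_2,$ and $nk_3$ observations to $x=-1,0,$ and $1$ respectively, these moments can be written directly in terms of the fractions:
$$\mu_1'=\frac1n\szdb{nk_1(-1)+nk_2(0)+nk_3(1)}=k_3-k_1,\qquad
\mu_2'=\frac1n\szdb{nk_1(1)+nk_2(0)+nk_3(1)}=k_1+k_3.$$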

Using these Mathematica commands:

m = {{1, m1, m2}, {m1, m2, m1}, {m2, m1, m2}};  (* X'X with the overall factor n omitted *)
c[m1_, m2_] = Inverse[m][[3, 3]] // FullSimplify  (* its bottom-right entry, which equals n*c22 *)

and restoring the factor of $n$ that was pulled out of $\mathbf{X}'\mathbf{X},$ we find that the desired element of the inverse is
$$c_{22}=\frac{\mu_1'^2-\mu_2'}{n(\mu_2'-1)(\mu_2'-\mu_1')(\mu_2'+\mu_1')},$$
making
$$V\big(\hat\beta_2\big)
=\frac{\mu_1'^2-\mu_2'}{(\mu_2'-1)(\mu_2'-\mu_1')(\mu_2'+\mu_1')}\cdot
\frac{\sigma^2}{n}.$$

My Questions: This variance doesn't seem right, because it looks as though it could be negative. Have I made a mistake somewhere? If not, how would I go about minimizing this expression subject to $x_i\in\{-1,0,1\}?$ I tried differentiating w.r.t. $\mu_1'$ and $\mu_2'$ and setting the results equal to zero, but I'm not sure that is the correct procedure, since those variables may not be independent and freely controllable.

Best Answer

First, by the definition of the $k$ values, there are $nk_1$ points at $-1,$ $nk_2$ points at $0,$ and $nk_3$ points at $1.$ Each such point contributes a row to $X$: either $(1,-1,1),$ $(1,0,0),$ or $(1,1,1).$ We therefore get
$$X^TX= \begin{pmatrix} n & n(k_3-k_1) & n(k_3+k_1)\\ n(k_3-k_1) & n(k_3+k_1) & n(k_3-k_1)\\ n(k_3+k_1) & n(k_3-k_1) & n(k_3+k_1) \end{pmatrix}.$$

We want to minimize $V(\hat{\beta}_2),$ which is $\sigma^2\cdot\left((X^TX)^{-1}\right)_{3,3}$ (the entry the question calls $c_{22}$). Using SageMath (or working by hand, which you should be able to do) we get
$$V(\hat{\beta}_2)=\frac{\sigma^2}{n}\left( \frac{k_1 + k_3 - k_1^2 + 2k_1k_3-k_3^2}{4k_1 k_2 k_3} \right).$$

So we need to minimize this unwelcoming fraction; call it $f.$ Note that we can't take any of the $k$ values to be zero, because (a) that is not the purpose of this question and (b) they are multiplicative in the denominator, so the variance would blow up. Substituting $k_2=1-k_1-k_3$ (so that $\partial k_2/\partial k_1=\partial k_2/\partial k_3=-1$), we differentiate w.r.t. $k_1$ and $k_3$ and set each derivative equal to $0$:
$$\frac{\partial f}{\partial k_1}=\frac{(1-2k_1+2k_3)\,4k_1k_2k_3-4k_3(k_2-k_1)(k_1 + k_3 - k_1^2 + 2k_1k_3-k_3^2)}{16k_1^2k_2^2k_3^2}=0,$$
and symmetrically for $k_3.$ I'll spare the unsympathetic math here (though you should do it); we eventually get
$$4k_1^2=(1-k_1-k_3)^2, \qquad 4k_3^2=(1-k_1-k_3)^2.$$
As all $k$ values are positive, we get the (quite anticipated) solution $k_1=k_3,$ and then $2k_1=1-2k_1,$ so $k_1=k_3=\frac{1}{4}$ and $k_2=\frac{1}{2}.$
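
This also settles the sign worry in the question: substituting $\mu_1'=k_3-k_1$ and $\mu_2'=k_1+k_3$ gives $\mu_2'-1=-k_2,$ $\mu_2'-\mu_1'=2k_1,$ $\mu_2'+\mu_1'=2k_3,$ and $\mu_1'^2-\mu_2'=-\big(k_1+k_3-(k_1-k_3)^2\big),$ so
$$\frac{\mu_1'^2-\mu_2'}{(\mu_2'-1)(\mu_2'-\mu_1')(\mu_2'+\mu_1')}
=\frac{k_1+k_3-(k_1-k_3)^2}{4k_1k_2k_3}>0,$$
which is exactly the fraction minimized above, so no mistake was made.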
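
For a quick numerical cross-check in Mathematica (a minimal sketch, not part of the derivation above; `f` is just a placeholder name for the bracketed fraction):

f[k1_, k3_] = (k1 + k3 - k1^2 + 2 k1 k3 - k3^2)/(4 k1 (1 - k1 - k3) k3);  (* n*V/sigma^2 with k2 = 1 - k1 - k3 substituted *)
NMinimize[{f[k1, k3], 0 < k1, 0 < k3, k1 + k3 < 1}, {k1, k3}]
(* expected output: {4., {k1 -> 0.25, k3 -> 0.25}}, i.e. the minimized variance is 4 sigma^2/n *)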