Cramer-Rao bound for LS estimator

statistics

It's a problem from Machine Learning: A Bayesian and Optimization Perspective (Problem 3.7):
Derive the Cramer-Rao bound for the LS estimator, when the training data result from the linear model $$y_n=\theta x_n+\eta_n,\quad n=1, 2, \dots, N,$$ where $x_n$ and $\eta_n$ are i.i.d. samples of a zero-mean random variable with variance $\sigma^2_x$ and of a zero-mean Gaussian random variable with variance $\sigma^2_{\eta}$, respectively. Assume also that $x$ and $\eta$ are independent. Then show that the LS estimator $$\hat\theta=\frac{\sum^N_{n=1}{x_n y_n}}{\sum^N_{n=1}{x_n^2}}$$ achieves the CR bound only asymptotically.

It seems to need the pdf of $y_n$, which is the sum of two independent random variables. Am I supposed to use the convolution formula (the theorem for the sum of two independent random variables)? The calculation looks extremely difficult because of the integration: $$p_y(y)=\int_{\mathbb{R}}\frac{1}{|\theta|}\,p\!\left(\frac{u}{\theta}\right)\frac{1}{\sqrt{2\pi \sigma_{\eta}^2}}\exp\!\left(-\frac{(y-u)^2}{2\sigma_{\eta}^2}\right)\mathrm{d}u,\quad\text{where } p \text{ is the pdf of } x.$$ The $\theta$ inside $p$ is what mostly causes the computational difficulty.

Any help or hints will be appreciated.

Best Answer

Edit: Included computation of the Cramér-Rao bound.

Note that $Y \mid X \sim \mathcal{N}(\theta X,\ \sigma_\eta^2)$.

We have $\ln p(y \mid x) = -\tfrac{1}{2}\ln(2\pi\sigma_\eta^2) - \frac{(y-\theta x)^2}{2\sigma_\eta^2}$, so the score is $\frac{\partial \ln p(y \mid x)}{\partial \theta} = \frac{x(y-\theta x)}{\sigma_\eta^2}$, and hence $$I_{(X,Y)}(\theta) = I_{Y|X}(\theta) = \mathbb{E}_{\theta,x}\!\left[\left(\frac{\partial \ln p(y \mid x)}{\partial \theta}\right)^{2}\right] = \mathbb{E}_{\theta,x}\!\left[\frac{X^2}{\sigma_\eta^2}\right] = \frac{\sigma_x^2}{\sigma_\eta^2}.$$

By the "chain rule" for fisher information.

Also, for the asymptotics you will not need to evaluate the density if you do it in the following way:

First, separate the terms:

$\hat\theta = \frac{\sum_n x_n y_n}{\sum_n x_n^2} = \theta \frac{\sum_n x_n^2}{\sum_n x_n^2} + \frac{\sum_n x_n \eta_n}{\sum_n x_n^2} = \theta + \frac{\sum_n x_n \eta_n}{\sum_n x_n^2} $

Now check the asymptotics:

$ \sqrt{N}\,(\hat\theta - \theta) = \sqrt{N}\, \frac{\sum_n x_n \eta_n}{\sum_n x_n^2} = \sqrt{N}\, \frac{\frac{1}{N}\sum_n x_n \eta_n}{\frac{1}{N}\sum_n x_n^2} = \sqrt{N}\, \frac{\overline{X \eta}}{\overline{X^2}}$

Now, by the law of large numbers, $ \overline{X^2} \overset{P}{\rightarrow} \sigma_x^2 $

And, by the central limit theorem (and the independence of $X$ and $\eta$): $\sqrt{N}\, \overline{X \eta} \overset{D}{\rightarrow} \mathcal{N}(0,\sigma_x^2 \sigma_\eta^2)$

Hence, by Slutsky's theorem / the continuous mapping theorem, you get that:

$ \sqrt{N}\, \frac{\overline{X \eta}}{\overline{X^2}} \overset{D}{\rightarrow} \mathcal{N}\!\left(0,\frac{\sigma_\eta^2}{\sigma_x^2}\right)$

which is the inverse of the per-sample Fisher information, so the LS estimator attains the Cramér-Rao bound asymptotically, as required.
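To see the finite-sample versus asymptotic behaviour numerically, here is a small simulation sketch (again with assumed, purely illustrative parameter values and a Gaussian $x$): it compares the empirical variance of $\hat\theta$ with the Cramér-Rao bound $\sigma_\eta^2/(N\sigma_x^2)$ for a few sample sizes.

```python
import numpy as np

# Assumed, illustrative parameter values (not from the problem).
theta, sigma_x, sigma_eta = 2.0, 1.5, 0.7
rng = np.random.default_rng(1)

def ls_estimate(N):
    """One LS estimate from N samples of the model y = theta * x + eta."""
    x = rng.normal(0.0, sigma_x, N)   # Gaussian x assumed; the argument only needs i.i.d. x with variance sigma_x^2
    y = theta * x + rng.normal(0.0, sigma_eta, N)
    return np.sum(x * y) / np.sum(x**2)

for N in (5, 50, 500):
    estimates = np.array([ls_estimate(N) for _ in range(20_000)])
    emp_var = estimates.var()
    cr_bound = sigma_eta**2 / (N * sigma_x**2)   # Cramer-Rao bound for N samples
    print(f"N={N:4d}  var={emp_var:.5f}  CR bound={cr_bound:.5f}  ratio={emp_var / cr_bound:.3f}")
```

The ratio exceeds 1 at small $N$ and only tends to 1 as $N \to \infty$, which is the sense in which the LS estimator achieves the bound asymptotically.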
