[Math] Linear MMSE estimate of MMSE estimator

Tags: mean square error, probability, probability theory

This question is prompted by a recent discussion (Conditional Expectation a Decreasing Function Implies Covariance is nonpositive) about the relationship between conditional expectation and covariance.

Suppose that $X$ and $Y$ are zero-mean unit-variance random variables with covariance (and correlation coefficient) $\rho$. The
minimum-mean-square error (MMSE) estimator of $Y$ given $X$ is
the random variable $g(X)$ that minimizes $E[(Y-g(X))^2]$, and as is
well known, $$g(X) = E[Y \mid X] ~\text{minimizes}~E[(Y-g(X))^2]$$
It is also well known that
$E[g(X)] = E[E[Y\mid X]] = E[Y] = 0$. In general,
$g(X)$ is a nonlinear function.
On the other hand, if the estimator is restricted to being
of the form $\hat{Y} = aX + b$ where $a$ and $b$ are real numbers,
then the linear MMSE estimator of $Y$ given $X$ is
$\hat{Y} = \rho X$, that is,
$$a = \rho, ~ b = 0, ~\text{minimizes}~E[(Y-aX-b)^2].$$
The linear MMSE estimator $\rho X$ has mean-square error
$E[(Y-\rho X)^2] = 1 - \rho^2$, and so the mean-square error
of the MMSE estimator $g(X)$ can be no larger:
$$E[(Y-g(X))^2] \leq 1 - \rho^2.$$
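As a quick sanity check on these formulas, here is a small Monte Carlo sketch in Python/NumPy. The bivariate normal model, the value $\rho = 0.6$, and the sample size are my own illustrative choices, not part of the question; least squares recovers $a \approx \rho$, $b \approx 0$, and the mean-square error of $\rho X$ comes out $\approx 1 - \rho^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Illustrative model (my choice): (X, Y) jointly normal, standardized,
# with correlation rho = 0.6.
rho = 0.6
X = rng.standard_normal(n)
Y = rho * X + np.sqrt(1 - rho**2) * rng.standard_normal(n)

# Best-fitting line Y ~ a*X + b via least squares:
a, b = np.polyfit(X, Y, 1)
print(a, b)                        # a is approx. 0.6, b is approx. 0.0

# Mean-square error of the linear MMSE estimator rho*X:
print(np.mean((Y - rho * X)**2))   # approx. 1 - rho**2 = 0.64
```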

A simplified version of the question in the
previous discussion is:
if $g(\cdot)$ is a decreasing
function of its argument, show that $\rho$ is nonpositive.

My question is: what is the linear MMSE estimate of $g(X) = E[Y \mid X]$
given $X$? That is, what choice of real numbers $c$ and $d$ minimizes
$E[(g(X) - cX - d)^2]$? Since $g(X)$ and $X$ both have zero mean and
$X$ has unit variance, standard linear MMSE estimator theory gives that
$d = 0$ and
$$c = \frac{\text{cov}(g(X),X)}{\text{var}(X)} = \text{cov}(g(X),X) = E[Xg(X)]$$
which I think might work out to be $\rho$, but I am not sure about this.
Any suggestions on how to proceed further would be appreciated.
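For what it is worth, a quick numerical experiment is consistent with the conjecture $c = \rho$. The cubic-plus-noise model below is purely illustrative (my own choice): $Y$ is a standardized version of $X^3 + W$ with independent noise, so $g(X) = E[Y \mid X]$ is the same standardization applied to $X^3$, and the sample estimates of $\rho = E[XY]$ and $c = E[Xg(X)]$ agree:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# Illustrative nonlinear model (my choice): Y is a standardized version
# of X**3 + W with independent noise W, so g(X) = E[Y | X] is the same
# standardization applied to X**3.
X = rng.standard_normal(n)
W = rng.standard_normal(n)
raw = X**3 + W
Y = (raw - raw.mean()) / raw.std()   # zero mean, unit variance
g = (X**3 - raw.mean()) / raw.std()  # conditional mean of Y given X

rho = np.mean(X * Y)   # correlation of X and Y (both standardized)
c = np.mean(X * g)     # cov(g(X), X) = E[X g(X)]
print(rho, c)          # both approx. 0.75 here: E[X^4]/std(X^3 + W) = 3/4
```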

Best Answer

Your conjecture is correct. By the law of total expectation, and using the fact that $X$ can be pulled out of a conditional expectation given $X$, we have
$$
\begin{align}
E(X(Y-g(X))) & = E\big(\,E(X(Y-g(X))\mid X)\,\big) \\
& = E\big(\, E(XY\mid X) - E(Xg(X)\mid X)\,\big) \\
& = E\big(\, XE(Y\mid X) - Xg(X) \,\big) \\
& = E\big( Xg(X) - Xg(X)\big) = 0.
\end{align}
$$
Therefore
$$
E(XY) = E(Xg(X)).
$$
Since $X$ and $Y$ are standardized, $E(XY) = \rho$, and hence $c = E(Xg(X)) = \rho$: the linear MMSE estimate of $g(X)$ given $X$ is indeed $\rho X$.
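The orthogonality $E[X(Y - g(X))] = 0$ established above is easy to check numerically. The model below (again my own illustrative choice, not from the question) takes $Y = \tanh(X) + W$ with $W$ independent of $X$, so $g(X) = \tanh(X)$, and the sample average of $X(Y - g(X))$ vanishes up to Monte Carlo error:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Illustrative model (my choice): Y = tanh(X) + W with W independent
# of X, so g(X) = E[Y | X] = tanh(X). The tower-property argument above
# gives E[X (Y - g(X))] = 0 exactly; the sample mean should be near 0.
X = rng.standard_normal(n)
Y = np.tanh(X) + rng.standard_normal(n)
g = np.tanh(X)

print(np.mean(X * (Y - g)))   # approx. 0, i.e. E[XY] = E[X g(X)]
```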
