Minimising expected square of differences to random variables

probability

Let us say we have three random variables $X_1$, $X_2$, and $X_3$ with joint distribution $P(X_1, X_2, X_3)$. I want to find the best (non-random) function $\theta_P (x_1, x_2, x_3)$ that minimises the expected value
$$
\mathbb E \left[ \left( \theta_P(X_1, X_2, X_3) - X_M \right)^2\right] \to \min,
$$

where $M$ is uniform over $\{1,2,3\}$ and independent of $X_1, X_2, X_3$. Expectation is taken over the distribution $P$ and the distribution of $M$.

Also, I am interested in another function, $\xi_P(x_1,x_2,x_3)$ that minimises the following expected value:
$$
\mathbb E \left[ \max_i \left( \xi_P(X_1, X_2, X_3) - X_i \right)^2 \right] \to \min.
$$

Expected value is taken over the distribution $P$. Most probably, both $\theta_P$ and $\xi_P$ will depend on the distribution $P$ (hence the subscript). One candidate for both cases is $$\theta(x_1, x_2, x_3) = \xi(x_1, x_2, x_3) = \frac{x_1 + x_2 + x_3}{3},$$ but I am not at all sure this gives the minimum in either case.

In some sense, both $\theta_P$ and $\xi_P$ are the closest (on average) points to $X_1, X_2, X_3$. The distance used here is the square of the Euclidean distance, but for the minimisation problem this should be equivalent to the Euclidean distance itself. (Right?)

Best Answer

I think I have an answer.

First case, $\theta_P$.

The first observation is that, since $M$ is uniform over $\{1,2,3\}$ and independent of $(X_1, X_2, X_3)$, conditioning on $M$ gives $$ \mathbb E\left[(\theta_P(X_1, X_2, X_3) - X_M)^2\right] = \frac 13 \mathbb E \left[\sum_{m=1}^3 (\theta_P(X_1, X_2, X_3) - X_m)^2 \right], $$ where the right-hand side no longer involves the random index $M$.
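This identity is easy to check numerically. Here is a small Monte Carlo sketch; the normal joint distribution, the sample size, and the choice of the mean as the candidate function are arbitrary assumptions made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
X = rng.normal(size=(n, 3))      # an arbitrary joint distribution P (assumed for illustration)
M = rng.integers(0, 3, size=n)   # uniform over {0,1,2}, independent of X
theta = X.mean(axis=1)           # one candidate theta: the coordinate mean

# Left-hand side: E[(theta - X_M)^2], with M actually sampled.
lhs = ((theta - X[np.arange(n), M]) ** 2).mean()

# Right-hand side: (1/3) E[sum_m (theta - X_m)^2], M averaged out analytically.
rhs = (((theta[:, None] - X) ** 2).sum(axis=1) / 3).mean()

print(lhs, rhs)  # the two estimates should nearly coincide
```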

The second fact is the monotonicity of expectation: if $A \le B$ pointwise, then $\mathbb E[A] \le \mathbb E[B]$. Hence a function that minimises the integrand pointwise also minimises the expectation, so we can look for the solution as $$ \theta_P(x_1, x_2, x_3) = \arg \min_t \left( \sum_{m=1}^3 (t-x_m)^2 \right), $$ where $x_1, x_2, x_3$ are treated as constants. In other words, we minimise the criterion separately for each fixed triple. By basic calculus (setting the derivative $2\sum_{m=1}^3 (t - x_m)$ to zero), the optimal $t$ is $$ t = \frac{x_1 + x_2 + x_3}3, $$ which is therefore the optimal $\theta_P(x_1, x_2, x_3)$.
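A quick grid search confirms the pointwise claim; the particular triple and the grid of candidate values $t$ below are arbitrary choices for illustration:

```python
import numpy as np

# For a fixed triple (x1, x2, x3), minimise sum_m (t - x_m)^2 over a grid of t.
x = np.array([1.0, 4.0, 7.0])          # arbitrary example triple
ts = np.linspace(-10, 10, 100_001)     # grid of candidate values of t
cost = ((ts[:, None] - x[None, :]) ** 2).sum(axis=1)
t_star = ts[np.argmin(cost)]

print(t_star, x.mean())  # both close to 4.0, the mean of the triple
```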

Surprisingly enough, the solution does not depend on the distribution of $(X_1, X_2, X_3)$.

Second case, $\xi_P$.

Again, we do optimisation for each triple of fixed values $(X_1, X_2, X_3)$ separately, i.e. $$ \xi_P(x_1, x_2, x_3) := \arg \min_t \left( \max \left( (t-x_1)^2, (t-x_2)^2, (t-x_3)^2 \right) \right). $$

We observe that $$ \max \left( (t-x_1)^2, (t-x_2)^2, (t-x_3)^2 \right) = \max \left( (t-x_{\min})^2, (t-x_{\max})^2 \right), $$ where $x_{\min} = \min(x_1, x_2, x_3)$ and $x_{\max} = \max(x_1, x_2, x_3)$. Finally, $$ \max( (t-x_\min)^2, (t-x_\max)^2) = \begin{cases} (t-x_\max)^2, & t < \frac{x_\min + x_\max}{2},\\ (t-x_\min)^2, & t \ge \frac{x_\min + x_\max}{2}, \end{cases} $$ where the first branch is decreasing in $t$ and the second is increasing, so the global minimum is achieved where they meet, at $t = \frac{x_\min + x_\max}{2}$, with value $\left(\frac{x_\max - x_\min}{2}\right)^2$. Altogether, $$ \xi_P(x_1, x_2, x_3) = \frac{\min(x_1, x_2, x_3) + \max(x_1, x_2, x_3)}2, $$ i.e. the midrange. Again, it does not depend on the distribution of $(X_1, X_2, X_3)$.
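As before, a small grid search illustrates that the midrange, not the mean, minimises the worst-case squared distance; the triple is an arbitrary example chosen so that the two differ:

```python
import numpy as np

# For a fixed triple, minimise max_m (t - x_m)^2 over a grid of t.
x = np.array([1.0, 2.0, 9.0])          # arbitrary triple: mean = 4.0, midrange = 5.0
ts = np.linspace(-10, 10, 100_001)     # grid of candidate values of t
cost = ((ts[:, None] - x[None, :]) ** 2).max(axis=1)
t_star = ts[np.argmin(cost)]

midrange = (x.min() + x.max()) / 2
print(t_star, midrange)  # both close to 5.0, away from the mean 4.0
```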