Prove that minimizing the variance is equivalent to minimizing the pairwise distance

optimization, proof-writing, variance

With reference to this answer for the question "Best estimation of a fitting parameter to measured data", can you prove that minimizing the Estimated Variance provides the same solution as minimizing the Pairwise Distance?

For reference, I copied the answer, which was proved using Mathematica for $N=20$, but not for a generic $N$:
$$\sum_{i=1}^N \left(y_i-\frac{\sum_{j=1}^N y_j}{N}\right)^2-\frac{\sum_{i=1}^N \sum_{j=i}^N \left(y_i-y_j\right)^2}{N}=0, \text{ with } N=20$$

I will try to state the request more formally (I hope it makes sense):
$$\arg\min_{\alpha \in \mathbb{R}} \mathrm{Var}[Y(\alpha)] = \arg\min_{\alpha \in \mathbb{R}} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \left(y_i(\alpha)-y_j(\alpha) \right)^2$$
where $\alpha$ is the parameter to estimate.

Best Answer

\begin{align} &\sum_{i=1}^{N-1} \sum_{j=i+1}^N (y_i - y_j)^2 \\ &=\sum_{i=1}^{N-1} \sum_{j=i+1}^N (y_i -\bar{y}+\bar{y}- y_j)^2 \tag{1} \\ &=\sum_{i=1}^{N-1} \sum_{j=i+1}^N (y_i -\bar{y})^2+\sum_{i=1}^{N-1} \sum_{j=i+1}^N ( y_j-\bar{y})^2 - 2\sum_{i=1}^{N-1} \sum_{j=i+1}^N (y_i-\bar{y})(y_j-\bar{y}) \tag{2}\\ &=\sum_{i=1}^{N-1} \sum_{j=i+1}^N (y_i -\bar{y})^2+\sum_{j=2}^N\sum_{i=1}^{j-1} ( y_j-\bar{y})^2 - \left( \sum_{i=1}^N \sum_{j=1}^N (y_i-\bar{y})(y_j-\bar{y}) - \sum_{i=1}^N(y_i - \bar{y})^2 \right) \tag{3}\\ &=\sum_{i=1}^{N-1} (N-i)(y_i -\bar{y})^2+\sum_{j=2}^N(j-1)( y_j-\bar{y})^2 \\&- \left( \sum_{i=1}^N (y_i-\bar{y})\sum_{j=1}^{N}(y_j-\bar{y}) - \sum_{i=1}^N(y_i - \bar{y})^2 \right) \tag{4}\\ &=\left((N-1)(y_1-\bar{y})^2+ \sum_{i=2}^{N-1} (N-i)(y_i -\bar{y})^2\right)\\&+\left(\sum_{j=2}^{N-1}(j-1)( y_j-\bar{y})^2 + (N-1)(y_N-\bar{y})^2\right) - \left(0 - \sum_{i=1}^N(y_i - \bar{y})^2 \right) \tag{5}\\ &=(N-1)(y_1-\bar{y})^2+ \sum_{i=2}^{N-1} (N-1)(y_i -\bar{y})^2 +(N-1)(y_N-\bar{y})^2+ \sum_{i=1}^N(y_i - \bar{y})^2 \tag{6} \\ &= (N-1) \sum_{i=1}^N (y_i - \bar{y})^2 + \sum_{i=1}^N (y_i - \bar{y})^2\\ &= N \sum_{i=1}^N (y_i - \bar{y})^2 \end{align}

  • The first equality holds because we subtract and add $\bar{y}$.
  • The second equality uses the expansion $(a-b)^2=a^2+b^2-2ab$ with $a=y_i-\bar{y}$ and $b=y_j-\bar{y}$.
  • For the third equality, consider the matrix $A$ whose $(i,j)$ entry is $(y_i-\bar{y})(y_j-\bar{y})$. This matrix is symmetric, so twice the sum of its entries above the diagonal equals the sum of all its entries minus the sum of its diagonal entries. Also, in the second term, we switch the order of summation over $i$ and $j$.
  • In the fourth equality, we evaluate the inner summation in the first two terms, and in the third term we factor out $(y_i-\bar{y})$.
  • In the fifth equality, we handle $i=1$ and $j=N$ separately. The $0$ in the last term comes from $\sum_{j=1}^N (y_j -\bar{y})=0$.
  • In the sixth equality, we combine terms with the same index, using $(N-i)+(i-1)=N-1$.

Since $N$ is a positive constant that does not depend on $\alpha$, and the variance estimate is $\sum_{i=1}^N (y_i-\bar{y})^2$ divided by $N$ (or by $N-1$), the pairwise sum is a fixed positive multiple of the variance. Both are therefore minimized by the same $\alpha$.
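The identity $\sum_{i<j}(y_i-y_j)^2 = N\sum_{i}(y_i-\bar{y})^2$ can also be spot-checked numerically for several values of $N$. The following is only a minimal sketch and not part of the proof. The function names `pairwise_sum` and `centered_sum`, the random test data, and the tolerance are illustrative assumptions.

```python
import math
import random


def pairwise_sum(y):
    """Sum of (y_i - y_j)^2 over all pairs with i < j."""
    n = len(y)
    return sum((y[i] - y[j]) ** 2 for i in range(n - 1) for j in range(i + 1, n))


def centered_sum(y):
    """Sum of (y_i - mean)^2, i.e. N times the (biased) variance estimate."""
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y)


# Hypothetical test data: random samples of a few different sizes.
random.seed(0)
for n in (2, 5, 20, 101):
    y = [random.uniform(-10.0, 10.0) for _ in range(n)]
    lhs = pairwise_sum(y)
    rhs = n * centered_sum(y)
    # The two sides should agree up to floating-point rounding.
    assert math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-9)
```

A check like this only confirms the identity for the sampled inputs. The derivation above is what establishes it for every $N$.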