Normal Distribution – Median of Sum vs. Sum of Median for Gaussian Variables

Tags: chi-squared-distribution, concavity, median, normal distribution

This is a problem I stumbled upon in my research. Consider $n$ Gaussian random variables $x_i \sim \mathcal{N} (\mu_i, \sigma_i^2)$, each with its own mean $\mu_i$ and variance $\sigma_i^2$. Can we say that
\begin{equation}
\text{Median} \left( \sum_{i=1}^n x_i^2 \right) \ge \sum_{i=1}^n \text{Median} \left( x_i^2 \right) \; ?
\end{equation}

If the $x_i$ were standard normal variables ($\mu_i = 0$ and $\sigma_i^2=1$) we could compute $\text{Median} (x_i^2) \simeq 0.4549$, and the sum $\sum_{i=1}^n x_i^2 $ would follow the $\chi^2$-distribution with $n$ degrees of freedom, which is reported to have median $\simeq n \left( 1 - \frac{2}{9n}\right)^3$. The above inequality would be
\begin{equation}
\left( 1 - \frac{2}{9n}\right)^3 \ge 0.4549
\end{equation}

which is satisfied $\forall n \in \mathbb{N}$. The question is whether the inequality also holds for non-standard normal variables. The fact that it holds for standard normal variables is related to the heavy-tailedness of the distribution of $x_i^2$, so I think it might also hold for general Gaussian variables.
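
The standard normal case can be checked numerically. The following is a minimal sketch using only the Python standard library; the names `median_sq_std` and `chi2_median_approx` are illustrative, and the check relies on the approximate median formula quoted above rather than the exact $\chi^2$ median, over a finite range of $n$.

```python
from statistics import NormalDist

# Median of x^2 for x ~ N(0, 1): P(x^2 <= m) = 2*Phi(sqrt(m)) - 1 = 1/2,
# so sqrt(m) is the 0.75 quantile of the standard normal distribution.
median_sq_std = NormalDist().inv_cdf(0.75) ** 2


def chi2_median_approx(n):
    """Approximate median of the chi-squared distribution with n degrees
    of freedom, using the formula quoted in the question."""
    return n * (1 - 2 / (9 * n)) ** 3


# Compare the approximate median of the sum with n times the single median.
holds = all(
    chi2_median_approx(n) >= n * median_sq_std for n in range(1, 1001)
)

print(f"Median(x^2) for standard normal x: {median_sq_std:.4f}")
print(f"Inequality holds for n = 1..1000 (approximate formula): {holds}")
```

The left-hand side $\left(1 - \frac{2}{9n}\right)^3$ increases with $n$, so the tightest case is $n = 1$, where it equals $(7/9)^3$.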

Best Answer

Here is a simple proof, by comparison with appropriate symmetric variables.

For a normal variable $X$, let $m$ be the median of $X^2$. The graphs show $X\sim N(4,1)$.

Now we take a variable $Y$ which is a version of $X^2$ but symmetrized from right to left about $m$:

[Figures: $X^2$ and its symmetrized version $Y$, for $X\sim N(4,1)$]

In formulas: $$ f_Y(y) = \begin{cases} f_{X^2}(y)\phantom{2m-\, }\ \text{ if }\, y>m\\ f_{X^2}(2m-y)\ \text{ if }\, y<m\\ \end{cases}$$ $$F_Y(y) = \begin{cases} \phantom{1-\, }F_{X^2}(y)\phantom{2m-\, }\ \text{ if }\, y>m\\ 1-F_{X^2}(2m-y)\ \text{ if }\, y<m\\ \end{cases} $$
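
The construction can be evaluated numerically. The sketch below, using only the Python standard library, computes $F_{X^2}$ in closed form from the normal CDF, finds $m$ by bisection, and builds $F_Y$ from the case formula. It then checks that $F_Y$ is symmetric about $m$ and records the smallest value of $F_Y - F_{X^2}$ on a grid, a comparison the argument goes on to use. The helper names (`cdf_sq`, `median_sq`, `cdf_sym`) and the grid are illustrative choices, not part of the answer.

```python
from math import sqrt
from statistics import NormalDist


def cdf_sq(y, mu, sigma):
    """CDF of X^2 at y, for X ~ N(mu, sigma^2)."""
    if y <= 0.0:
        return 0.0
    x = NormalDist(mu, sigma)
    r = sqrt(y)
    return x.cdf(r) - x.cdf(-r)


def median_sq(mu, sigma, tol=1e-12):
    """Median of X^2, found by bisection on its CDF."""
    lo, hi = 0.0, 1.0
    while cdf_sq(hi, mu, sigma) < 0.5:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if cdf_sq(mid, mu, sigma) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cdf_sym(y, mu, sigma, m):
    """CDF of the symmetrized variable Y, following the case formula."""
    if y >= m:
        return cdf_sq(y, mu, sigma)
    return 1.0 - cdf_sq(2.0 * m - y, mu, sigma)


mu, sigma = 4.0, 1.0  # the example used for the graphs
m = median_sq(mu, sigma)
offsets = [0.1 * k for k in range(401)]  # t = 0.0, 0.1, ..., 40.0

# Symmetry about m: F_Y(m + t) + F_Y(m - t) should equal 1.
sym_err = max(
    abs(cdf_sym(m + t, mu, sigma, m) + cdf_sym(m - t, mu, sigma, m) - 1.0)
    for t in offsets
)

# Smallest value of F_Y(y) - F_{X^2}(y) for y = m - t on the grid.
min_gap = min(
    cdf_sym(m - t, mu, sigma, m) - cdf_sq(m - t, mu, sigma) for t in offsets
)

print(f"m = {m:.6f}, symmetry error = {sym_err:.2e}, min gap = {min_gap:.2e}")
```

A grid check like this is only a numerical sanity check for one choice of $\mu$ and $\sigma$; it does not replace the argument.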

Since the $Y_i$'s are symmetric about their medians $m_i$ (and independent, assuming the $X_i$ are), the values near $(y_1, \ldots, y_n)$ and $(2m_1-y_1, \ldots, 2m_n-y_n)$ are equally probable, and their sums lie on opposite sides of $m_1+\cdots+m_n$. So $m_1+\cdots+m_n$ must be the median of their sum.

Also $X^2$ dominates $Y$, i.e. $F_{X^2}(y)\le F_Y(y)$, as can be seen in graphs like the above. Thus:

$$ \begin{align} \text{median}\left(\sum X_i^2\right) &\ge \text{median}\left(\sum Y_i\right)\ &\text{ (by the dominance of the }X\text{'s)}\phantom{\ \square}\\ &=\sum\text{median}(Y_i)\ &\text{ (by the symmetry of the }Y\text{'s)}\phantom{\ \square}\\ &=\sum\text{median}(X_i^2)\ &\text{ (by the construction of the }Y\text{'s)}\ \square \end{align} $$
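
The conclusion can also be probed by simulation. The following Monte Carlo sketch uses only the Python standard library and assumes the $X_i$ are independent. The function `compare`, the parameter values, the sample size and the seed are all illustrative choices. The sum of medians is computed from the closed-form CDF of $X_i^2$, while the median of the sum is estimated from samples, so it carries sampling error.

```python
import random
from math import sqrt
from statistics import NormalDist, median


def cdf_sq(y, mu, sigma):
    """CDF of X^2 at y, for X ~ N(mu, sigma^2)."""
    if y <= 0.0:
        return 0.0
    x = NormalDist(mu, sigma)
    r = sqrt(y)
    return x.cdf(r) - x.cdf(-r)


def median_sq(mu, sigma, tol=1e-12):
    """Median of X^2, found by bisection on its CDF."""
    lo, hi = 0.0, 1.0
    while cdf_sq(hi, mu, sigma) < 0.5:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if cdf_sq(mid, mu, sigma) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def compare(mus, sigmas, n_samples=100_000, seed=0):
    """Return (estimated median of sum of X_i^2, sum of medians of X_i^2).

    The X_i are drawn independently (an assumption of this sketch).
    """
    rng = random.Random(seed)
    sums = [
        sum(rng.gauss(mu, s) ** 2 for mu, s in zip(mus, sigmas))
        for _ in range(n_samples)
    ]
    sum_of_medians = sum(median_sq(mu, s) for mu, s in zip(mus, sigmas))
    return median(sums), sum_of_medians


# Illustrative parameters with different means and variances.
median_of_sum, sum_of_medians = compare([0.0, 1.0, 4.0], [1.0, 2.0, 0.5])
print(f"median of sum  ~ {median_of_sum:.3f}")
print(f"sum of medians = {sum_of_medians:.3f}")
```

A simulation like this can only fail to find a counterexample for the parameters tried; the proof above is what establishes the inequality.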
