Solved – Convergence in Probability of Empirical Median


I'm stuck with this one.

Let $X_1, \dots, X_n$ be an i.i.d. sequence of random variables with CDF $F$. The empirical CDF of the $X_i$ is defined as
$$
\hat F_n(x) = \frac{1}{n} \sum_{1 \leq i \leq n} I\{X_i \leq x \}.
$$
Note that for $x \in \Re$, $\hat F_n(x) \xrightarrow{p} F(x)$. Also, define the smallest median of $F$ as
$$
\theta_0 = \inf \{ x \in \Re : F(x) \geq .5 \}.
$$
Suppose that the median is unique, i.e., for any $\epsilon > 0$, $P(X_i \leq \theta_0 + \epsilon) > .5$. Define an estimator $\hat \theta_n$ of $\theta_0$ by
$$
\hat \theta_n = \inf \{x \in \Re : \hat F_n(x) \geq .5 \}.
$$

Show that $\hat \theta_n \xrightarrow{p} \theta_0$.

Best Answer

We wish to show that $\hat \theta_n \xrightarrow{P} \theta_0$. By definition, this states that given $\epsilon > 0$ $$ P(|\hat \theta_n - \theta_0| > \epsilon) \rightarrow 0 \text{ as } n \rightarrow \infty. $$

Since $X_1, \dots, X_n$ are i.i.d. and $n$ is finite, we can sort the sample in ascending order as $X_{(1)} \leq \dots \leq X_{(n)}$, and the definition of $\hat \theta_n$ gives $\hat \theta_n = X_{(\lceil n/2 \rceil)}$. If there are no ties among the observations (which holds with probability one when $F$ is continuous), it follows that $$ \hat F_n(\hat \theta_n) = \left \{ \begin{array}{ll} 1/2 & \text{ if $n$ is even} \\ \frac{1}{2} + \frac{1}{2n} & \text{ if $n$ is odd.} \end{array} \right. $$
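As a quick sanity check of the order-statistic claim, here is a minimal Python sketch. The helper names `ecdf`, `empirical_median`, and `order_statistic_median` are illustrative and not part of the original answer, and the standard normal data is an assumption chosen so that ties occur with probability zero. It computes $\hat \theta_n$ directly from the infimum definition and compares it with the $\lceil n/2 \rceil$-th smallest observation.

```python
import math
import random

def ecdf(sample, x):
    """Empirical CDF: fraction of observations <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)

def empirical_median(sample):
    """Smallest x with ecdf(sample, x) >= 1/2.

    The ECDF only jumps at observed values, so the infimum is attained
    at one of them; scanning the sorted sample finds it.
    """
    for v in sorted(sample):
        if ecdf(sample, v) >= 0.5:
            return v
    raise ValueError("empty sample")

def order_statistic_median(sample):
    """The ceil(n/2)-th smallest observation (1-based indexing)."""
    n = len(sample)
    return sorted(sample)[math.ceil(n / 2) - 1]

# Assumption for illustration: N(0, 1) data, so ties have probability zero.
rng = random.Random(0)
for n in (1, 2, 5, 10, 11, 100, 101):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    theta_hat = empirical_median(sample)
    # The infimum definition agrees with the order statistic.
    assert theta_hat == order_statistic_median(sample)
    # ECDF value at the estimator: 1/2 for even n, 1/2 + 1/(2n) for odd n.
    expected = 0.5 if n % 2 == 0 else 0.5 + 1 / (2 * n)
    assert math.isclose(ecdf(sample, theta_hat), expected)
```

With tied observations the formula for $\hat F_n(\hat \theta_n)$ can fail (the ECDF may jump past $1/2 + 1/(2n)$), which is why the check uses continuous data.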

Using this, we can calculate \begin{align} P(|\hat \theta_n - \theta_0| > \epsilon) &\leq P(|\hat \theta_n - \theta_0| \geq \epsilon) \\ &= P(\hat \theta_n \leq \theta_0 - \epsilon) + P(\theta_0 + \epsilon \leq \hat \theta_n) \\ &\leq P(\hat F_n(\hat \theta_n) \leq \hat F_n(\theta_0 - \epsilon)) + P(\hat F_n(\theta_0 + \epsilon) \leq \hat F_n(\hat \theta_n)) \end{align} where the last inequality holds because $\hat F_n(x)$ is non-decreasing: each event on the second line implies the corresponding event on the third. Because $\hat F_n(x)$ is non-decreasing but not strictly increasing, it preserves only weak inequalities, which is why we passed to $\geq \epsilon$ in the first line.

We continue by analyzing the two terms individually. First, consider the term $$ P(\hat F_n(\hat \theta_n) \leq \hat F_n(\theta_0 - \epsilon)). $$ When $n$ is even, \begin{align*} P(1/2 \leq \hat F_n(\theta_0 - \epsilon)) &= P(1/2 - F(\theta_0 - \epsilon) \leq \hat F_n(\theta_0 - \epsilon) - F(\theta_0 - \epsilon))\\ &\rightarrow 0 \text{ as } n \rightarrow \infty, \end{align*} which follows because $1/2 - F(\theta_0 - \epsilon) > 0$ (by the definition of $\theta_0$ as an infimum, $F(x) < 1/2$ for every $x < \theta_0$) and because $\hat F_n(x) \xrightarrow{p} F(x)$, as noted above. Similarly, when $n$ is odd, \begin{align*} P(1/2 + \frac{1}{2n} \leq \hat F_n(\theta_0 - \epsilon)) &= P(1/2 + \frac{1}{2n} - F(\theta_0 - \epsilon) \leq \hat F_n(\theta_0 - \epsilon) - F(\theta_0 - \epsilon))\\ &\rightarrow 0 \text{ as } n \rightarrow \infty, \end{align*} since the left-hand side of the inequality is at least $1/2 - F(\theta_0 - \epsilon) > 0$.

Now consider the term $$ P(\hat F_n(\theta_0 + \epsilon) \leq \hat F_n(\hat \theta_n)). $$ Similar to before, we have \begin{align*} P(\hat F_n(\theta_0 + \epsilon) \leq \hat F_n(\hat \theta_n)) &= P(-\hat F_n(\theta_0 + \epsilon) \geq -\hat F_n(\hat \theta_n)) \\ &= P(F(\theta_0 + \epsilon) -\hat F_n(\theta_0 + \epsilon) \geq F(\theta_0 + \epsilon) -\hat F_n(\hat \theta_n)) \\ &\rightarrow 0 \text{ as } n \rightarrow \infty. \end{align*} This again uses $\hat F_n(x) \xrightarrow{p} F(x)$, together with the fact that $F(\theta_0 + \epsilon) -\hat F_n(\hat \theta_n)$ is bounded below by a positive constant for all large $n$: the uniqueness of the median gives $F(\theta_0 + \epsilon) > .5$ for every $\epsilon > 0$, while $\hat F_n(\hat \theta_n) \leq \frac{1}{2} + \frac{1}{2n}$ and $\frac{1}{2n} \rightarrow 0$.

Applying these two results to the earlier bound gives \begin{align*} P(|\hat \theta_n - \theta_0| > \epsilon) &\leq P(|\hat \theta_n - \theta_0| \geq \epsilon) \\ &\rightarrow 0 \text{ as } n \rightarrow \infty \end{align*} and thus $$ \hat \theta_n \xrightarrow{p} \theta_0. $$
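The conclusion can also be illustrated numerically. The following is a rough Monte Carlo sketch, not part of the proof. It assumes standard normal data (so $\theta_0 = 0$ and the median is unique), a tolerance $\epsilon = 0.2$, and 400 replications per sample size; the function name `exceed_prob` and all of these settings are arbitrary choices made for illustration.

```python
import math
import random

def empirical_median(sample):
    """Smallest median of the ECDF: the ceil(n/2)-th order statistic."""
    n = len(sample)
    return sorted(sample)[math.ceil(n / 2) - 1]

def exceed_prob(n, eps, reps, rng, theta0=0.0):
    """Monte Carlo estimate of P(|theta_hat_n - theta0| > eps).

    Assumption: the data are i.i.d. N(0, 1), whose unique median is 0.
    """
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(empirical_median(sample) - theta0) > eps:
            hits += 1
    return hits / reps

rng = random.Random(12345)
estimates = {n: exceed_prob(n, eps=0.2, reps=400, rng=rng) for n in (10, 100, 1000)}
for n, p in estimates.items():
    print(f"n={n:5d}  estimated P(|theta_hat - theta0| > 0.2) = {p:.3f}")
```

The estimated exceedance probabilities should shrink toward zero as $n$ grows, which is what convergence in probability predicts for any fixed $\epsilon > 0$.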
