In *Probability and Computing* (Mitzenmacher & Upfal), Markov's inequality is proved roughly as follows:
(Theorem) Markov's Inequality. Let $X$ be a random variable that assumes only nonnegative values. Then, for all $a > 0$,
$$
\Pr(X \geq a) \leq \frac{\mathbb{E}[X]}{a}
$$
Proof:
For $a > 0$, let
- $I = 1 \text{ if } X \geq a$
- $I = 0 \text{ otherwise }$
We also have
$$
I \leq \frac{X}{a}
$$
and
$$
\mathbb{E}[I] = \Pr(X \geq a)
$$
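To make the setup concrete, here is a small hypothetical example (not from the book): take $X$ to be a fair six-sided die and $a = 4$, and check both the pointwise bound $I \leq X/a$ and the resulting inequality between expectations with exact rational arithmetic.

```python
from fractions import Fraction

# Hypothetical example: X is a fair six-sided die, a = 4.
outcomes = [Fraction(x) for x in range(1, 7)]
a = Fraction(4)
p = Fraction(1, 6)  # each outcome is equally likely

# Pointwise check: I = 1{X >= a} never exceeds X/a, since I = 1 forces X/a >= 1,
# and I = 0 is at most X/a because X is nonnegative.
assert all((1 if x >= a else 0) <= x / a for x in outcomes)

E_I = sum(p for x in outcomes if x >= a)         # E[I] = Pr(X >= 4) = 1/2
E_X_over_a = sum(p * x / a for x in outcomes)    # E[X/a] = (7/2)/4 = 7/8

print(E_I, E_X_over_a)  # 1/2 7/8
assert E_I <= E_X_over_a  # Markov: Pr(X >= 4) <= E[X]/4
```

So here $\Pr(X \geq 4) = 1/2 \leq 7/8 = \mathbb{E}[X]/4$, exactly as the theorem predicts.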
(so far so good)
Now they "take the expectation" of both sides of the inequality:
$$
\mathbb{E}[I] \leq \mathbb{E} \left[ \frac{X}{a} \right]
$$
Why is this a "legitimate move"? Let's simplify:
$$
g(X) \leq X \implies \mathbb{E}[g(X)] \leq \mathbb{E}[X]
$$
It may be very simple, but why does this inequality between expectations hold? Does it follow from Jensen's inequality, or from something even simpler?
Best Answer
Thanks @sbares and @angryavian! Here is a summary of their answers:
$$g(X) \leq X \iff 0 \leq X - g(X)$$
The expectation of a non-negative random variable (here: $X - g(X)$) must also be non-negative, so
$$0 \leq X - g(X) \implies 0 \leq \mathbb{E}[X - g(X)]$$
Now by linearity of expectation
$$ 0 \leq \mathbb{E}[X - g(X)] \iff 0 \leq \mathbb{E}[X] - \mathbb{E}[g(X)] \iff \mathbb{E}[g(X)] \leq \mathbb{E}[X] $$
(To recover the step in the original proof, substitute $I$ for $g(X)$ and $\frac{X}{a}$ for $X$.)
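The monotonicity argument can also be sanity-checked numerically. The sketch below (my own illustration, not from the answers) samples from an exponential distribution with rate 1 and verifies that since $I \leq X/a$ holds sample by sample, the corresponding sample means inherit the same ordering, which is exactly the "non-negative differences have non-negative mean" reasoning above.

```python
import random

random.seed(0)

# Sample X ~ Exp(1), so E[X] = 1, and use a = 2.
a = 2.0
samples = [random.expovariate(1.0) for _ in range(100_000)]

g = [1.0 if x >= a else 0.0 for x in samples]  # g(X) = I = 1{X >= a}
scaled = [x / a for x in samples]              # X / a

# Pointwise inequality: g(X) <= X/a for every sample.
assert all(gx <= sx for gx, sx in zip(g, scaled))

def mean(xs):
    return sum(xs) / len(xs)

# Each term of mean(scaled) - mean(g) averages the non-negative
# differences X/a - g(X), so the sample means satisfy mean(g) <= mean(scaled),
# mirroring E[g(X)] <= E[X/a].
print(mean(g), mean(scaled))
assert mean(g) <= mean(scaled)
```

Here `mean(g)` estimates $\Pr(X \geq 2) = e^{-2} \approx 0.135$ and `mean(scaled)` estimates $\mathbb{E}[X]/2 = 0.5$, so the Markov bound is loose but valid, as expected.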