Proving that the probabilities for which the variance of a distribution is greater than some number is convex

convex-optimization, convex-analysis, probability, variance

From Boyd & Vandenberghe's Convex Optimization, exercise 2.15 (f) and (g) — I am following along with the solutions from here.

2.15 Some sets of probability distributions. Let $x$ be a real-valued random variable with $\textbf{prob}(x = a_i) = p_i, i = 1,\dots, n$, where $a_1 < a_2 < \dots < a_n$. Of course $p \in {\bf R}^n$ lies in the standard probability simplex $P = \left\{ p \mid \mathbf{1}^T p = 1, p \succeq 0 \right\}$. Which of the following conditions are convex in $p$? (That is, for which of the following conditions is the set of $p \in P$ that satisfy the condition convex?)


For (f) and (g), the questions relate to the variance, where

$$\operatorname{var} (x) = {\Bbb E} (x^2) - {\Bbb E} (x)^2$$

and

$${\Bbb E} (x) = \sum_{i=1}^{n}{a_ip_i}$$

(f) asks whether the set of $p$ satisfying $\operatorname{var}(x) \le \alpha$ is convex, and

(g) asks whether the set of $p$ satisfying $\operatorname{var}(x) \ge \alpha$ is convex.

In the solutions, the logic for why (f) is not convex makes sense to me, but the logic for why (g) is convex does not. The solution claims that the constraint in (g) is equivalent to:

$$\sum_{i=1}^{n}{a_i^2p_i} + \left(\sum_{i=1}^{n}{a_ip_i}\right)^2 = b^Tp + p^TAp \le \alpha$$

and thus must be convex because $aa^T$ is positive semidefinite (so the quadratic form $p^T A p$ is convex, and adding the linear term $b^T p$ preserves convexity?), but I do not understand how they jumped straight to that from:

$$\sum_{i=1}^{n}{a_i^2p_i} - \left(\sum_{i=1}^{n}{a_ip_i}\right)^2 = b^Tp - p^TAp \ge \alpha$$

implied by

$$\operatorname{var} (x) = {\Bbb E} \left( x^2 \right) - {\Bbb E} (x)^2 \ge \alpha$$

Shouldn't it instead be

$$ p^T A p - b^T p \le -\alpha$$

and no longer guarantee convexity in $p$?

Best Answer

Let's go over the derivation again. We have \begin{align*} \text{var}(x) & = E(x^2) - (E(x))^2\\ & = \sum_{i = 1}^n a_i^2 p_i - \Big( \sum_{i = 1}^n a_i p_i \Big)^2\\ & = b^T p - p^T A p, \end{align*} where $b = \begin{bmatrix} a_1^2\\ \vdots\\ a_n^2 \end{bmatrix}$, $A = aa^T$.
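To see this identity concretely, here is a small numpy sanity check (not from the book; the values of $a$ and $p$ are arbitrary): it compares the direct variance formula against the quadratic form $b^T p - p^T A p$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
a = np.sort(rng.normal(size=n))   # outcomes a_1 < a_2 < ... < a_n
p = rng.random(n)
p /= p.sum()                      # an arbitrary point in the probability simplex

b = a**2                          # b_i = a_i^2
A = np.outer(a, a)                # A = a a^T, rank-one and positive semidefinite

var_direct = np.sum(a**2 * p) - np.sum(a * p)**2   # E(x^2) - (E(x))^2
var_quadratic = b @ p - p @ A @ p                  # b^T p - p^T A p

assert np.isclose(var_direct, var_quadratic)
```

Both expressions agree to floating-point precision, confirming that the variance is the concave quadratic $b^T p - p^T A p$ in $p$.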

So $\text{var}(x) \geq \alpha$ is equivalent to $p^T A p - b^T p \leq -\alpha$. (I don't know why the solution manual wrote $b^T p + p^T A p \leq \alpha$; it looks like a sign mistake to me. Nevertheless, it does not affect the conclusion.)

Define the function $f(p) = p^T A p - b^T p$. The important point is that since $A = aa^T$ is positive semidefinite, the quadratic term is convex, and subtracting the linear term $b^T p$ preserves convexity, so $f(p)$ is a convex function (see Chapter 3 of your book). The set $\{p \in \mathbb{R}^n \ | \ f(p) \leq -\alpha \}$ is a sublevel set of a convex function, which must be convex.

So the set of probabilities $p$ satisfying $\text{var}(x) \geq \alpha$ is $\{p \in \mathbb{R}^n \ | \ f(p) \leq -\alpha\} \cap \{p \ | \ \mathbf{1}^T p = 1, p \succeq 0\}$, an intersection of two convex sets, and hence convex.
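Equivalently, since $\text{var}$ is concave in $p$, its superlevel sets are convex: if $\text{var}(p) \geq \alpha$ and $\text{var}(q) \geq \alpha$, then any mixture $\theta p + (1-\theta) q$ also has variance at least $\alpha$. Here is a quick numerical check of that concavity inequality (not from the book; $a$, $p$, $q$, and $\theta$ are random):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
a = np.sort(rng.normal(size=n))
b, A = a**2, np.outer(a, a)
var = lambda p: b @ p - p @ A @ p   # var(x) as a function of p

# Concavity: var(theta*p + (1-theta)*q) >= theta*var(p) + (1-theta)*var(q)
for _ in range(1000):
    p = rng.random(n); p /= p.sum()
    q = rng.random(n); q /= q.sum()
    theta = rng.random()
    m = theta * p + (1 - theta) * q
    assert var(m) >= theta * var(p) + (1 - theta) * var(q) - 1e-12
```

The assertion never fires, matching the fact that the Hessian of $\text{var}$ is $-2A = -2aa^T \preceq 0$.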
