Bound $\|\mathbb{E} [ \boldsymbol{H} (x-y)]\|^2$ for $\mu \preccurlyeq \boldsymbol{H} \preccurlyeq L$

convex optimizationconvex-analysisexpected valuematrix-calculusoptimization

Suppose function $F$ is twice-differentiable, L-lipschitz smooth and $\mu$-strongly convex. Let $\boldsymbol{x}(\xi), \boldsymbol{y}(\xi) \in \mathbb{R}^d$, for a random variable $\xi$. Denote $\boldsymbol{H}=\int_0^1 \nabla^2 F(\boldsymbol{y}+s(\boldsymbol{x}-\boldsymbol{y})) \mathrm{d} s$. Then, prove that
$$
\mu\langle\mathbb{E}[\boldsymbol{x}-\boldsymbol{y}], \mathbb{E}[\boldsymbol{H}(\boldsymbol{x}-\boldsymbol{y})]\rangle \leq \mathbb{E}\langle\boldsymbol{H}(\boldsymbol{x}-\boldsymbol{y}), \mathbb{E}[\boldsymbol{H}(\boldsymbol{x}-\boldsymbol{y})]\rangle \leq L\langle\mathbb{E}[\boldsymbol{x}-\boldsymbol{y}], \mathbb{E}[\boldsymbol{H}(\boldsymbol{x}-\boldsymbol{y})]\rangle .
$$

My approach to this was to first note that $\mu \preccurlyeq \boldsymbol{H} \preccurlyeq L$. But to be honest, after this I'm completely stuck. Would anyone know how to get this inequality? The definitions of $L$-smoothness and $\mu$-strongly convex are the same as the standard version, e.g. same as here: How to get regularity condition from smooth and strong convex conditions?. Apparently, the potential hint (which may be wrong) is that the integral form of $\boldsymbol{H}$ is rather irrelevant, and it is sufficient to just use $\mu \preccurlyeq \boldsymbol{H} \preccurlyeq L$.

Best Answer

Counterexample to stated claim.

Consider the function $f(x) = x \tan^{-1}(x) - (1/2) \log(1+x^2)$, which is convex and has $f''(x) = 1/(1+x^2)$. On any bounded domain $[-M, M]$, then $$ \frac{1}{1 + M^2} \leq f''(x) \leq 1 $$ Thus, we see that $f(x)$ satisfies $\mu = (1+ M^2)^{-1}$ strong convexity and $1$-smoothness.

On the other hand, $$ H(x, y) = \int_0^1 f''(x + t(y-x)) \, dt = f'(y) - f'(x) = \tan^{-1}(y) - \tan^{-1}(x). $$

Thus, let us consider $y \equiv 0$ and $x$ which is equiprobably $\pm 1$.

Then $H(x, y) (x - y) = -\tan^{-1}(x) x$. As a consequence, $$ \mathbb{E}[H(x, y) (x - y)] = -\pi/ 4 $$ Your inequality in this case---using that $\mathbb{E}[x-y] =0$, then reads as $$ 0 \leq \frac{\pi^2}{16} \leq 0, $$ which is obviously absurd.


Extension to a globally strongly convex and smooth function.

Let us pick any $M > 0$ and define the function

$$ g_M(x) = \begin{cases} f(x), & |x| \leq M \\ \tfrac{1}{2(1+ M^2)}(x-M)^2 + \tan^{-1}(M)(x-M) + f(M) & x > M \\ \tfrac{1}{2(1+ M^2)}(x+M)^2 + \tan^{-1}(-M)(x+M) + f(-M) & x < -M \end{cases} $$ which is a quadratic extension of $f$.

It is easily verified that $g_M$ is twice differentiable and moreover $$ g''_M(x) = \begin{cases} f''(x) & |x| \leq M \\ \tfrac{1}{1 + M^2} & |x| > M \end{cases}, $$ which satisfies $g''_M(x) \in [\tfrac{1}{1+M^2}, 1]$ globally, i.e., for any real $x$.

Thus $g''_M$ is $(1+M^2)^{-1}$-strongly convex and $1$-smooth.

The previous counterexample thus extends to $g_M$ as soon as $M > 1$, since $g_M = f, g'_M = f', g''_M = f''$ on $[-1, 1]$, which contains the support of the random variables $x, y$.


What is actually true.

Using the fact that $\mu I \preceq H\preceq L I$, it is easy to see that for any $x, y, H$ $$ \mu \langle H(x- y), x-y \rangle \leq \langle H(x-y), H(x-y) \rangle \leq L \langle H(x -y), x-y \rangle. $$ (This follows by noting that $\mu H \preceq H^2 \preceq L H$.)

Taking expectations, I would imagine that this is the desired result: $$ \mu \mathbb{E}[\langle H(x- y), x-y \rangle] \leq \mathbb{E}[\langle H(x-y), H(x-y) \rangle] \leq L \mathbb{E}[\langle H(x -y), x-y \rangle]. $$