[Math] Maximizing the expectation of a polynomial function of iid random variables

Tags: expectation, pr.probability, probability-distributions

Let $f \colon \mathbb R^N \to \mathbb R$ be a smooth function, let $\mu$ be a probability measure on $[0,1]$, and let $X_1, \ldots , X_N$ be i.i.d. random variables with common law $\mu$.

Question 1. What is the maximum value of the expectation
$$
\mathbb E[\vert f(X_1, \ldots , X_N) \vert] = \int_{[0,1]^N} \vert f(x_1, \ldots , x_N) \vert \, d\mu(x_1) \cdots d\mu(x_N)
$$

among all probability measures $\mu$ on $[0,1]$?

This question arises from this post and from this recent answer of Sangchul Lee, which however seem tailored to specific functions $f$, and particularly to the case $N=2$.

I am especially interested in the case $N\ge 3$; the function $f$ may be as smooth as needed (e.g. a polynomial). I have trouble extending Sangchul Lee's (very elegant) variational approach to more variables, as no "bilinear form" is available.

Following Nate Eldredge's comment, it is not restrictive to consider only absolutely continuous measures. Furthermore, we can first treat a somewhat easier case, where $f$ is a polynomial (and then perhaps argue by uniform approximation). In addition, thanks to a clever "random thought" due to Pierre PC (see the comments below), we may assume the integrand is symmetric. All in all, the problem "boils down" to the following:

Question 1 (easier version): Maximize
$$
\int_{[0,1]^N} p(x_1, \ldots, x_N) g(x_1) g(x_2) \ldots g(x_N) dx_1 dx_2 \ldots dx_N,
$$

among functions $g \ge 0$ with $\int_0^1 g(s)\, ds = 1$, where $p \in \mathbb R[x_1, \ldots , x_N]$ is a symmetric polynomial (of constant sign on $[0,1]^N$?).

Can this be handled by the generalized Hölder inequality?


Since the general case seems too hard, let me propose a toy model to begin with. To me, the difficult point is not the integrand (in view of the comments above, general reduction arguments should lead to a very simple form for it) but rather the fact that $N \ge 3$ (in contrast with S. Lee's approach, which is tailored to $N=2$).

So here we go:

Question 2. (baby version) What is the maximum value of the expectation
$$
\mathbb E[|(X-Y)(Y-Z)(Z-X)|] = \int_{[0,1]^3} |(x-y)(y-z)(z-x)| \, d\mu(x) \, d\mu(y) \, d\mu(z)
$$

among all probability measures $\mu$ on $[0,1]$?
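For reference, here is a quick Monte Carlo baseline (my own sketch, not part of the original question): under the uniform measure on $[0,1]$ one can compute $\mathbb E[|(X-Y)(Y-Z)(Z-X)|] = 1/30$ exactly via order statistics, and any maximising measure must do at least this well.

```python
import numpy as np

# Monte Carlo baseline for Question 2 under the *uniform* measure on [0,1].
# (For uniform mu, E|(X-Y)(Y-Z)(Z-X)| = 1/30 exactly, computable via the
# order statistics of three uniforms; a maximiser must beat this value.)
rng = np.random.default_rng(0)
x, y, z = rng.random((3, 200_000))
estimate = np.mean(np.abs((x - y) * (y - z) * (z - x)))
print(estimate)  # close to 1/30 ≈ 0.0333
```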

Best Answer

We can in principle find the optimal first $n$ moments for a polynomial approximation of $f$. This reduces the problem to a truncated Hausdorff moment problem, for which solutions exist.

Let $f \colon [0,1]^d \to \mathbb{R}$ be a continuous function. Approximate $|f|$ by its multivariate Bernstein polynomial of order $n$ (writing $f_{k_1\cdots k_d} = f(k_1/n, \ldots, k_d/n)$ for short),

$$|f_n|(x) = \sum_{k_1 \cdots k_d} |f_{k_1\cdots k_d}| \prod_{i=1}^{d} \binom{n}{k_i} \, x_i^{k_i} (1-x_i)^{n-k_i} \;.$$

Let $S_k = E[X^k (1-X)^{n-k}]$. The expectation of the approximant is then

$$E[|f_n|(X)] = \sum_{k_1 \cdots k_d} |f_{k_1\cdots k_d}| \prod_{i=1}^{d} \binom{n}{k_i} \, S_{k_i} \;,$$
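As a concrete illustration (a sketch with parameters of my choosing), for the uniform law the $S_k$ are Beta integrals, $S_k = \int_0^1 x^k(1-x)^{n-k}\,dx = k!(n-k)!/(n+1)!$, and the formula above can be evaluated for the baby integrand and compared to the exact value $1/30$:

```python
import numpy as np
from math import comb, factorial

# Evaluate E[|f_n|(X)] from the S_k formula for the baby integrand
# |(x-y)(y-z)(z-x)| under the uniform law, where the S_k are the Beta
# integrals S_k = k!(n-k)!/(n+1)!  (so that C(n,k) S_k = 1/(n+1)).
n = 12
S = [factorial(k) * factorial(n - k) / factorial(n + 1) for k in range(n + 1)]
w = np.array([comb(n, k) * S[k] for k in range(n + 1)])  # each equals 1/(n+1)
grid = np.arange(n + 1) / n
x, y, z = np.meshgrid(grid, grid, grid, indexing="ij")
f = np.abs((x - y) * (y - z) * (z - x))
E = np.einsum("ijk,i,j,k->", f, w, w, w)  # should be near 1/30
```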

which we want to maximise. The value of $S_{k}$ is a linear combination of the first $n$ moments $M_r$ of the distribution. Expanding $x^k(1-x)^{n-k}$ by the binomial theorem,

$$S_k = \sum_{r=0}^{n} \binom{n-k}{r-k} (-1)^{r-k} \, M_r \;,$$

the expectation in terms of the moments is

$$E[|f_n|(X)] = \sum_{k_1 \cdots k_d} \sum_{r_1 \cdots r_d} |f_{k_1\cdots k_d}| \prod_{i=1}^{d} \binom{n}{r_i} \binom{r_i}{k_i} (-1)^{r_i-k_i} \, M_{r_i} \;.$$
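The expansion of $S_k$ in the moments can be checked numerically (my own sanity check, using the uniform moments $M_r = 1/(r+1)$ against the exact Beta-integral values):

```python
from math import comb, factorial

# Check of S_k = sum_r C(n-k, r-k) (-1)^(r-k) M_r on the uniform example,
# where M_r = 1/(r+1) and the exact values are S_k = k!(n-k)!/(n+1)!.
# Terms with r < k vanish, so the sum effectively starts at r = k.
n = 10
M = [1.0 / (r + 1) for r in range(n + 1)]
S = [sum(comb(n - k, r - k) * (-1) ** (r - k) * M[r] for r in range(k, n + 1))
     for k in range(n + 1)]
S_exact = [factorial(k) * factorial(n - k) / factorial(n + 1) for k in range(n + 1)]
```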

The problem is hence the constrained multilinear optimisation

$$ \begin{aligned} \max_{M_0 \cdots M_n} \;&\; \sum_{k_1 \cdots k_d} \sum_{r_1 \cdots r_d} |f_{k_1\cdots k_d}| \prod_{i=1}^{d} \binom{n}{r_i} \binom{r_i}{k_i} (-1)^{r_i-k_i} \, M_{r_i} \;, \\ \text{s.t.} \;&\; \sum_{r=0}^{n} \binom{n-k}{r-k} (-1)^{r-k} \, M_r \ge 0 \;, \\ &\; M_r \in [0,1] \;, \quad M_0 = 1 \;. \end{aligned} $$

To make the problem slightly more tractable, the relation between $S_k$ and $M_r$ can be inverted using the binomial partial sums,

$$M_r = \sum_{k=0}^{n} \binom{n-r}{k-r} \, S_k \;,$$
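The inversion can be verified on the same uniform example (my own check): plugging in the Beta-integral values of $S_k$ should return the uniform moments $M_r = 1/(r+1)$.

```python
from math import comb, factorial

# Check of the inversion M_r = sum_k C(n-r, k-r) S_k: with
# S_k = k!(n-k)!/(n+1)! (uniform law) it should recover M_r = 1/(r+1).
# Terms with k < r vanish, so the sum effectively starts at k = r.
n = 10
S = [factorial(k) * factorial(n - k) / factorial(n + 1) for k in range(n + 1)]
M = [sum(comb(n - r, k - r) * S[k] for k in range(r, n + 1)) for r in range(n + 1)]
```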

so that the optimisation becomes

$$ \begin{aligned} \max_{S_0 \cdots S_n} \;&\; \sum_{k_1 \cdots k_d} |f_{k_1\cdots k_d}| \prod_{i=1}^{d} \binom{n}{k_i} \, S_{k_i} \;, \\ \text{s.t.} \;&\; \sum_{k=0}^{n} \binom{n-r}{k-r} \, S_k \in [0,1] \;, \\ &\; \sum_{k=0}^{n} \binom{n}{k} \, S_k = 1 \;, \quad S_k \ge 0 \;. \end{aligned} $$

However, the first constraint is actually superfluous: since $S_k \ge 0$ and $\binom{n-r}{k-r} \le \binom{n}{k}$, it is implied by the normalisation. Letting further $p_k = \binom{n}{k} S_k$, we obtain the optimisation problem in its most basic form,

$$ \begin{aligned} \max_{p_0 \cdots p_n} \;&\; \sum_{k_1 \cdots k_d} |f_{k_1\cdots k_d}| \prod_{i=1}^{d} p_{k_i} \;, \\ \text{s.t.} \;&\; \sum_{k=0}^{n} p_k = 1 \;, \quad p_k \ge 0 \;. \end{aligned} $$

which is simply the problem of maximising the expectation of the discretised function $|f_{k_1 \cdots k_d}|$ under the discrete distribution $p$.
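One simple way to attack this last problem numerically (a sketch of mine, not necessarily the method used for the result below; grid size and iteration count are arbitrary choices) is the multiplicative update $p_i \propto p_i \, \partial F/\partial p_i$, the Baum-Eagon growth transform, which never decreases a polynomial with nonnegative coefficients on the simplex:

```python
import numpy as np

# Baum-Eagon-style multiplicative ascent for the baby integrand
# |(x-y)(y-z)(z-x)| discretised on a grid of n+1 points in [0,1].
# Each update p_i <- p_i * (dF/dp_i) / Z stays on the simplex and
# never decreases the multilinear objective F(p).
n = 20
grid = np.arange(n + 1) / n
x, y, z = np.meshgrid(grid, grid, grid, indexing="ij")
F = np.abs((x - y) * (y - z) * (z - x))      # discretised |f|

def value(p):
    return np.einsum("ijk,i,j,k->", F, p, p, p)

p = np.full(n + 1, 1.0 / (n + 1))            # start from uniform weights
v0 = value(p)
for _ in range(500):
    g = np.einsum("ijk,j,k->i", F, p, p)     # gradient / 3 (F is symmetric)
    p *= g
    p /= p.sum()
v1 = value(p)
print(v1)  # monotonically improved over v0
```

With a finer grid ($n = 50$) and enough iterations this kind of scheme should get close to the value reported below, though I have not checked that a local ascent of this sort always reaches the global optimum.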

From a cursory Google search, multilinear optimisation appears difficult (but I have no idea, really). Nevertheless, it is a recipe for calculations: for the baby problem, I find (with $n = 50$) the expectation $0.0615760$ and the distribution $p$ shown below.

[Figure: the optimal discrete distribution $p$]
