Prove that $\arg\max_{t}\langle \sum w_i a_{t_i}, a_t\rangle$ is close to $t_1$

gaussianoptimization

Let $a_t$ be a Gaussian type function with center $t\in \mathbb R^d$, i.e.
$$a_t(x)=e^{-||x-t||^2}, \text{ for } x \in \mathbb R^d.$$

Let $b$ be a combination of these Gaussian functions
$$b = \sum_{i=1}^s w_i a_{t_i}$$
where $t_i \in \mathbb R^d$ for all $i=1,…, s$ and $w_1 > …. > w_s>0$ is a finite sequence of decreasing positive numbers.

Define $$\hat t = \arg\max_{t\in \mathbb R^d} \langle b, a_t\rangle.$$

Here the inner product $\langle b, a_t\rangle$ is defined as $\int_{x\in \mathbb R^d} b(x) a_t(x)dx$.

Intuitively, we can easily see that if the Gaussian functions are far away of each others, then $\hat t$ will be close to $t_1$, i.e. we have

Claim. If $\Delta= \min_{i\neq j; i, j\leq s}||t_i-t_j||\rightarrow +\infty$ then $\hat t \rightarrow t_1$.

I am wondering how to prove the claim rigorously. I have no idea since the function $\langle b, a_t\rangle$ is not even convex. Is there any idea to tackle this problem? Thanks!

Best Answer

Define $D$ as the following constant: $$ D= \int_{x \in \mathbb{R}^d} e^{-2||x||^2}dx$$ For the norm $||x||^2 = \sum_{i=1}^d x_i^2$ we have the following fact: For any $x, a, b \in \mathbb{R}^d$: $$ ||x-a||^2 + ||x-b||^2 = 2||x-\frac{(a+b)}{2}||^2 + \frac{1}{2}||a-b||^2$$ Define $g:\mathbb{R}^d\rightarrow\mathbb{R}$ by $$ g(t) = \int_{x \in \mathbb{R}^d} b(x) e^{-||x-t||^2}dx$$ Define $$g^* = \sup_{t \in \mathbb{R}^d} g(t)$$ Then for any $t \in \mathbb{R}^d$ we get \begin{align} g(t) &= \sum_{i=1}^s w_i \int_{x \in \mathbb{R}^d} e^{-||x-t_i||^2} e^{-||x-t||^2} dx\\ &\overset{(a)}{=}\sum_{i=1}^s w_i \int_{x \in \mathbb{R}^d}e^{-\left[2||x - \frac{(t+t_i)}{2}||^2 + \frac{1}{2}||t-t_i||^2 \right]} dx\\ &= \sum_{i=1}^s w_i e^{-\frac{1}{2}||t-t_i||^2}\int_{x \in \mathbb{R}^d} e^{-2||x-\frac{(t+t_i)}{2}||^2}dx\\ &= D\sum_{i=1}^s w_i e^{-\frac{1}{2}||t-t_i||^2} \end{align} where step (a) uses our fact about norms. In particular: $$g^*\geq g(t_1) > D w_1$$

For each $r>0$ define $A_i(r)$ as the closed ball of radius $r$ about $t_i$: $$ A_i(r) = \{x \in \mathbb{R}^d : ||x-t_i||\leq r\}$$

Fix $\epsilon>0$ and suppose $\epsilon$ is sufficiently small to ensure that $Dw_1>\epsilon$ and $Dw_1> Dw_2+\epsilon$. Fix $r>0$ big enough to ensure $Dsw_1e^{-r^2/2}\leq \epsilon$. Suppose the center points $t_1, ..., t_s$ are far enough apart so that the sets $A_1(r), A_2(r), ..., A_s(r)$ are disjoint.

Case 1: Suppose $t \notin \cup_{i=1}^d A_i(r)$. Then $||t-t_i||\geq r$ for all $i$ and so \begin{align} g(t) &\leq D \sum_{i=1}^s w_i e^{-\frac{r^2}{2}} \\ &\leq D s w_1 e^{-\frac{r^2}{2}} \\ &\leq \epsilon \\ &< Dw_1 \end{align}
Case 2: Suppose $t \in A_k(r)$ for some $k \in \{2, ..., s\}$. Then $||t-t_j||\geq r$ for all $j \neq k$ and so $$ g(t) \leq Dw_k + D(s-1)w_1e^{-\frac{r^2}{2}} \leq Dw_2 + \epsilon <Dw_1$$

It follows that if $t \notin A_1(r)$ then $g(t)<Dw_1<g^*$. So the supremum value $g^*$ can only be approached over points in the compact set $A_1(r)$. Since $g$ is continuous, it follows that the supremum can be achieved, and any maximizer point $t^*$ must satisfy $t^* \in A_1(r)$. Now for any point $t \in A_1(r)$ we have $||t-t_j||\geq r$ for all $j \in \{2, ..., s\}$ and so \begin{align} g(t) &\leq Dw_1 e^{-\frac{||t-t_1||^2}{2}} + Dsw_1e^{-r^2/2} \\ &\leq Dw_1 - Dw_1(1- e^{-\frac{||t-t_1||^2}{2}}) + \epsilon\\ &< g^*- Dw_1(1- e^{-\frac{||t-t_1||^2}{2}}) + \epsilon \end{align} In particular, if $t^*$ is a maximizer, then $t^* \in A_1(r)$ and so $$ g^* = g(t^*) < g^* - Dw_1(1- e^{-\frac{||t^*-t_1||^2}{2}}) + \epsilon$$ In particular $$ Dw_1 (1- e^{-\frac{||t^*-t_1||^2}{2}}) < \epsilon $$ Define $\delta = \frac{\epsilon}{Dw_1}$ and note by our assumptions that $0<\delta < 1$. Define $z = \frac{||t^*-t_1||^2}{2}$. It follows that $e^{-z} >1-\delta$, so \begin{align} z &< -\log(1-\delta) \\ &= \log(\frac{1}{1-\delta})\\ &= \log(1 + \frac{\delta}{1-\delta})\\ &\leq \frac{\delta}{1-\delta} \end{align} In particular, if $t^*$ is a maximizer then $$ \frac{||t^*-t_1||^2}{2} \leq \frac{\delta}{1-\delta} = \frac{\epsilon}{Dw_1 - \epsilon}$$

Best Answer

Related Solutions

Compute the dual of an optimization problem defined on a function space

Why is this functional derivative equal to $0$

Related Question