Showing that the best approximating linear map for a Lipschitz function is also Lipschitz

functional-analysis harmonic-analysis lipschitz-functions real-analysis

For $d,n \in \mathbb{N}$ with $1 \leq d<n$, let $f: \mathbb{R}^d \to \mathbb{R}^n$ be a Lipschitz map with some constant $L \geq 1$. Let $B=B(0,r)$ for some $r>0$ and define the quantity
$$ \Omega(B) = \inf_{A} \left( \frac{1}{|B|} \int_{B} \left( \frac{|f(y)-A(y)|}{r} \right)^2 dy \right)^{\frac{1}{2}}, $$

where the infimum is over all linear maps $A: \mathbb{R}^{d} \to \mathbb{R}^{n}$ and $|B|$ is the $d$-dimensional Lebesgue measure of $B$.

Suppose that $\Omega(B) < \epsilon$, where $\epsilon$ can be taken as small as we wish. I am trying to show that the minimizing map $A$ for $\Omega(B)$ is also Lipschitz with constant, say, $2L$. Intuitively this should be true: if $\Omega(B)$ is small, then $A$ approximates the $L$-Lipschitz function $f$ very well inside $B$ and so should not deviate much from $f$. I suspect that even without the assumption $\Omega(B) < \epsilon$, the best approximating map is still $2L$-Lipschitz.

For example, if we had defined an "$L^{\infty}$" version of this quantity by
$$\Omega_{\infty}(B) = \inf_{A} \frac{|| f-A ||_{L^{\infty}(B)}}{r}$$
then $\Omega_{\infty}(B) \leq \frac{|| f-f(0) ||_{L^{\infty}(B)}}{r} \leq L$, so that the map $A$ realizing the infimum in $\Omega_{\infty}(B)$ satisfies $|f(x)-A(x)| \leq L r$ for every $x \in B$, and from here it is not difficult to show that $A$ is $2L$-Lipschitz. I am having trouble with the map $A$ for the integral version $\Omega(B)$, though. Some help would be appreciated.

References: These quantities originate from Dorronsoro's paper.

Best Answer

Partial answer.

The quantity $\Omega(B)$ is $r^{-1}$ times the $L^2$ norm of $f-A$ associated to the uniform probability measure on the ball $B(0,r)$ (denoted by $\mu$).

I use the notation $$||f||_{\mathrm{Lip}} = \sup_{x \ne y }\frac{||f(y)-f(x)||}{||y-x||}$$ for the best Lipschitz constant.

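One needs only consider the case $n=1$. Indeed, minimizing $$||f-A||_2^2 = \sum_{i=1}^n ||f_i-A_i||_2^2$$ is equivalent to minimizing each $||f_i-A_i||_2^2$ separately for $1 \le i \le n$, and this remains true in any orthonormal basis of $\mathbf{R}^n$. Hence, for every unit vector $u \in \mathbf{R}^n$, the scalar function $\langle u,f \rangle$ is $L$-Lipschitz and its best approximating linear function is $\langle u,A \rangle$. If the scalar case gives $||\langle u,A \rangle||_{\mathrm{Lip}} \le L$ for every such $u$, then $||A||_{\mathrm{Lip}} \le L$. (In general one only has $||f||_{\mathrm{Lip}}^2 \le \sum_{i=1}^n ||f_i||_{\mathrm{Lip}}^2$, so summing componentwise bounds would lose a factor.)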

Call $(b_1,\ldots,b_d)$ the canonical basis of $\mathbf{R}^d$ and $(e_1,\ldots,e_d)$ its dual basis, so that $e_j(x) = x_j$. Then $(e_1,\ldots,e_d)$ is an orthogonal family in $L^2(\mu)$ and a basis of the subspace of all linear functions. All vectors of this basis have the same norm, and, writing $v_d$ for the volume of the unit ball of $\mathbf{R}^d$, $$||e_1||_2^2+\ldots+||e_d||_2^2 = |B|^{-1} \int_B (x_1^2+\ldots+x_d^2) \mathrm{d}x = (v_d r^d)^{-1} \int_0^r \rho^2 \, d v_d \rho^{d-1} \mathrm{d}\rho = \frac{d r^2}{d+2},$$ so that $||e_j||_2^2 = \frac{r^2}{d+2}$ for each $j$.
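As a sanity check on the norm of a coordinate function, here is a minimal sketch that averages $x_1^2$ over a midpoint grid in the disk $B(0,r) \subset \mathbf{R}^2$ and compares it with $r^2/(d+2)$ for $d=2$. The function name, the grid size and the radius $r = 1.5$ are illustrative choices, not part of the argument.

```python
def mean_square_coordinate(r, n=600):
    """Approximate ||e_1||_2^2, the average of x_1^2 over the disk B(0, r) in R^2."""
    h = 2.0 * r / n  # side of one grid cell in the square [-r, r]^2
    total = 0.0
    count = 0
    for i in range(n):
        x1 = -r + (i + 0.5) * h
        for j in range(n):
            x2 = -r + (j + 0.5) * h
            # keep only the cell midpoints that lie inside the disk
            if x1 * x1 + x2 * x2 <= r * r:
                total += x1 * x1
                count += 1
    return total / count


r = 1.5  # illustrative radius
d = 2
approx = mean_square_coordinate(r)
exact = r * r / (d + 2)  # value of ||e_j||_2^2 for the uniform measure on B(0, r)
print(approx, exact)
```

The two printed numbers should agree up to the discretization error of the grid, which comes mainly from the cells crossing the boundary of the disk.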

The linear map $A$ which minimizes $||f-A||_2^2$ is the orthogonal projection of $f$ on the subspace of all linear functions. Hence $$A = \sum_{j=1}^d \frac{\langle f,e_j \rangle}{\langle e_j,e_j \rangle} e_j.$$ Given $x$ and $y$ in $B$, $$(A(y)-A(x))^2 = \Big(\sum_{j=1}^d \frac{\langle f,e_j \rangle}{\langle e_j,e_j \rangle} (y_j-x_j) \Big)^2.$$ By the Cauchy-Schwarz inequality, $$(A(y)-A(x))^2 \le \sum_{j=1}^d \Big(\frac{\langle f,e_j \rangle}{\langle e_j,e_j \rangle}\Big)^2 ||y-x||^2 = \frac{||A||_2^2}{||e_1||_2^2} \times ||y-x||^2.$$ Since linear maps have zero average on $B$ (by odd symmetry), $A$ is also the orthogonal projection of $f-E(f)$, where $E(f)$ denotes the mean value (expectation) of $f$ with respect to $\mu$. An orthogonal projection does not increase the norm, so $||A||_2 \le ||f-E(f)||_2$. Hence $$|A(y)-A(x)| \le \frac{||f-E(f)||_2}{||e_1||_2} ||y-x||.$$
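The projection formula can be tried out numerically in the simplest case $d = n = 1$, where $B = [-r, r]$ and $A(x) = a x$ with $a = \langle f,e_1 \rangle / \langle e_1,e_1 \rangle$. The sketch below approximates both inner products with a midpoint rule. The test function $f(x) = |x - 0.3|$ (which is $1$-Lipschitz) and the helper name are assumptions made for illustration only.

```python
def best_linear_slope(f, r, n=20000):
    """Slope a of the L^2(mu)-orthogonal projection x -> a*x of f on [-r, r]."""
    h = 2.0 * r / n
    num = 0.0  # proportional to <f, e_1>, the average of f(x) * x
    den = 0.0  # proportional to <e_1, e_1>, the average of x^2
    for k in range(n):
        x = -r + (k + 0.5) * h  # midpoint of the k-th subinterval
        num += f(x) * x
        den += x * x
    # the common normalization factor h / (2r) cancels in the ratio
    return num / den


def lipschitz_f(x):
    # hypothetical test function: 1-Lipschitz and not linear
    return abs(x - 0.3)


slope = best_linear_slope(lipschitz_f, r=1.0)
print(slope)
```

For this test function the slope comes out near $-0.44$, so the best linear approximation has Lipschitz constant below $L = 1$, which is consistent with the bound being sought. A single example is of course only an illustration and not evidence for the general claim.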

Now, what remains to be proven is that $||f-E(f)||_2 \le L||e_1||_2$ when $f$ is $L$-Lipschitz. In other words, the functions $L\langle u,\cdot\rangle$, where $u$ is a unit vector, maximize the norm in $L^2(\mu)$ among all $L$-Lipschitz functions with zero average on $B$. Here is an intuitive reason: using Fubini's theorem, one sees that $$2||f-E(f)||_2^2 = \int_B \int_B \big(f(y)-f(x)\big)^2 \mathrm{d}\mu(x)\mathrm{d}\mu(y).$$ One may assume that $f$ is $\mathcal{C}^1$. Then the way to maximize this quantity under the constraint that $f$ is $L$-Lipschitz is to take the gradient of $f$ constant.
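The double-integral identity and the conjectured bound can be checked on an example in dimension $d = 1$, where $||e_1||_2^2 = r^2/3$. The following sketch replaces $\mu$ by the uniform measure on a midpoint grid of $[-r, r]$. The test function $|x - 0.3|$, the grid size and the helper name are illustrative assumptions.

```python
def variance_and_double_average(f, r, n=400):
    """Return (||f - E(f)||_2^2, average of (f(y) - f(x))^2 over pairs) on a grid of [-r, r]."""
    h = 2.0 * r / n
    values = [f(-r + (k + 0.5) * h) for k in range(n)]
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    # discrete version of the double integral over B x B
    double_avg = sum((a - b) ** 2 for a in values for b in values) / (n * n)
    return variance, double_avg


L, r = 1.0, 1.0  # Lipschitz constant of the test function and radius of B
variance, double_avg = variance_and_double_average(lambda x: abs(x - 0.3), r)
bound = L * L * r * r / 3.0  # L^2 * ||e_1||_2^2 when d = 1
print(2 * variance, double_avg, bound)
```

The first two printed values coincide up to rounding, since the identity is purely algebraic and also holds for the discrete measure. The variance stays below the bound for this particular function, which does not prove the general statement.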
