It is stated in section 10 of this paper that the usual Sobolev embedding $$W^{1,1}(\mathbb{R}^n) \subset L^{n/(n-1)}(\mathbb{R}^n)$$ can be interpreted in geometric terms as an isoperimetric statement. Although the authors say that this is well known, I cannot find such a statement in Evans's classical book on PDEs. Can anyone elaborate on the details here? That is, how can we derive an isoperimetric-type inequality from such a Sobolev embedding?
Geometric implication of the Sobolev embedding
functional-analysis, geometric-functional-analysis, partial-differential-equations, sobolev-spaces
Related Solutions
To motivate Sobolev spaces, let me pose a problem.
Let $\Omega$ be a smooth, bounded domain in ${\Bbb R}^n$ and let $f$ be a $C^\infty$ function on $\Omega$. Prove that there exists a $C^2$ function $u$ satisfying $-\Delta u = f$ in $\Omega$ and $u = 0$ on the boundary of $\Omega$.
As far as PDEs go, this is the tamest of the tame: it's a second-order, constant-coefficient elliptic PDE with a smooth right-hand side and a smooth boundary. Should be easy, right? It certainly can be done, but you'll find it's harder than you might think.
Imagine replacing the PDE with something more complicated like $-\operatorname{div}(A(x)\nabla u) = f$ for some $C^1$ uniformly positive definite matrix-valued function $A$. Proving even the existence of solutions is a nightmare. Such PDEs come up all the time in the natural sciences, for instance representing the equilibrium distribution of heat (or stress, concentration of impurities, ...) in an inhomogeneous, anisotropic medium.
Proving the existence of weak solutions to such PDEs in Sobolev spaces is remarkably simple: once all the relevant theoretical machinery has been worked out, the existence, uniqueness, and other useful properties of the solutions can be proven in only a couple of lines. The reason Sobolev spaces are so effective for PDEs is that they are Banach spaces, and thus the powerful tools of functional analysis can be brought to bear. In particular, the existence of weak solutions to many elliptic PDEs follows directly from the Lax–Milgram theorem.
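To make this concrete, here is a sketch (details as in any standard graduate PDE text) of how Lax–Milgram applies to the model problem $-\Delta u = f$, $u = 0$ on $\partial\Omega$:

```latex
\[
\text{Let } H = H^1_0(\Omega), \qquad
B(u,v) = \int_\Omega \nabla u \cdot \nabla v \, dx, \qquad
F(v) = \int_\Omega f v \, dx.
\]
\[
|B(u,v)| \le \|u\|_H \|v\|_H
\quad \text{(boundedness)}, \qquad
B(u,u) = \|\nabla u\|_{L^2}^2 \ge c\,\|u\|_H^2
\quad \text{(coercivity, via Poincar\'e)},
\]
\[
|F(v)| \le \|f\|_{L^2} \|v\|_{L^2},
\quad \text{so Lax--Milgram yields a unique } u \in H
\text{ with } B(u,v) = F(v) \text{ for all } v \in H.
\]
```

That unique $u$ is precisely a weak solution in the sense described below.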
So what is a weak solution to a PDE? In simple terms, you take the PDE and multiply by a suitably chosen${}^*$ test function and integrate over the domain. For my problem, for instance, a weak formulation would be to say that $-\int_\Omega v\Delta u \, dx = \int_\Omega fv \, dx$ for all $C^\infty_0$ functions $v$. We often want to use integration by parts to simplify our weak formulation so that the order of the highest derivative appearing in the expression goes down: you can check that in fact $\int_\Omega \nabla v\cdot \nabla u \, dx = \int_\Omega fv \, dx$ for all $C^\infty_0$ functions $v$.
Note the logic. You begin with a smooth solution to your PDE, which a priori may or may not exist. You then derive from the PDE a certain integral equation which is guaranteed to hold for all suitable test functions $v$. You then define $u$ to be a weak solution of the PDE if the integral equation holds for all test functions $v$.
By construction, every classical solution to the PDE is a weak solution. Conversely, you can show that if $u$ is a $C^2$ weak solution, then $u$ is a classical solution.${}^\dagger$ Showing the existence of solutions in a Sobolev space is easy, but proving that they have enough regularity (that is, they are continuously differentiable up to some order, $2$ in our case) to be classical solutions often requires very lengthy and technical proofs.${}^\$$
(The Sobolev embedding theorems you mention in your post are one of the key tools--they establish that if you have enough weak derivatives in the Sobolev sense, then you are also guaranteed a certain number of classical derivatives. The downside is that you have to work in a Sobolev space $W^{k,p}$ where $p$ is larger than the dimension of the space, $n$. This is a major bummer since we like to work in $W^{k,2}$, which is a Hilbert space and thus has much nicer functional-analytic tools. Alternatively, if you show that your function is in $W^{k,2}$ for every $k$, then it is guaranteed to lie in $C^\infty$.)
All of what I've written kind of dances around the central question of why Sobolev spaces are so useful and why all of these functional analytic tools work for Sobolev spaces but not for spaces like $C^2$. In a sentence, completeness is really, really important. Often, in analysis, when we want to show a solution to something exists, it's much easier to construct a bunch of approximate solutions and then show those approximations converge to a bona fide solution. But without completeness, there might not be a solution (a priori, at least) for them to converge to. As a much simpler example, think of the intermediate value theorem. $f(x) = x^2-2$ has $f(2) = 2$ and $f(0) = -2$, so there must exist a zero (namely $\sqrt{2}$) in $(0,2)$. This conclusion fails over the rationals, however, since the rationals are not complete: $\sqrt{2} \notin {\Bbb Q}$. In fact, one way to define the Sobolev spaces is as the completion of $C^\infty$ (or $C^k$ for $k$ large enough) under the Sobolev norms.${}^\%$
I do not have the space here to answer your questions (1) and (2) directly, as answering them in detail really requires spinning out a whole theory. Most graduate textbooks on PDEs should have answers with all the details spelled out. (Evans is the standard reference, although he doesn't include potential theory, so he doesn't answer (1), directly at least.) Hopefully this answer at least motivates why Sobolev spaces are the "appropriate space to look for solutions to PDEs".
${}^*$ Depending on the boundary conditions of the PDE, our test functions may or may not need to vanish on the boundary. Additionally, to make the functional analysis nice, we often want our test functions to be taken from the same Sobolev space in which we seek solutions. This usually poses no problem, as we may begin by taking our test functions to be $C^\infty$ and use approximation arguments to extend to all functions in a suitable Sobolev space.
${}^\dagger$ Apply integration by parts to recover $-\int_\Omega v\Delta u \, dx = \int_\Omega fv \, dx$ for all $C^\infty_0$ functions $v$, then apply the fundamental lemma of the calculus of variations.
${}^\$$ Take a look at a regularity proof for elliptic equations in your advanced PDE book of choice.
${}^\%$ You might ask why complete in Sobolev norm, not some simpler norm like $L^p$? Unfortunately, the $L^p$ completion of $C^\infty$ is $L^p$, and there are functions in $L^p$ which you can't define any sensible weak or strong derivative of. Thus, in order to define a complete normed space of differentiable functions, the derivative has to enter the norm (which is why the Sobolev norms are important, and in some sense natural.)
Let's not worry about the optimal geometric conditions on $U$ and just assume throughout that $U$ is smooth.
The trace operator, the usual one, maps $H^1(U)\to L^2(\partial U)$. This means that if $u\in H^2(U)$, then we can apply it to both $u$ and its first derivatives. In particular if $u\in H^2(U)$ then we can make sense of $u|_{\partial U}$ and $\nabla u|_{\partial U}$ (although at first they have no derivative relation on $\partial U$, but see point 2 below). This suggests that we define $$ \partial_{\nu} u(x):= \nabla u(x) \cdot \nu(x) , \qquad x\in \partial U, $$ as an element of $L^2(\partial U)$.
Now the intuition behind the equivalence you mention is the following: if we define $\nabla_{\|}u:= \nabla u-\partial_{\nu}u\, \nu$ as the projection of $\nabla u$ onto the orthogonal complement of $\nu$, then $\nabla_{\|}u$ (at least when $u$ is smooth) consists of the derivatives "tangential" to $\partial U$; i.e. $\nabla_{\|}u$ only sees the behavior of $u$ on $\partial U$. What this does for us is the following: if $u=0$ on $\partial U$, then $\nabla_{\|}u=0$, and so in this case, by definition, we have $\nabla u= \partial_{\nu} u\,\nu$; or put another way: under the assumption that $u=0$ on $\partial U$, the vanishing of the full gradient (on $\partial U$) is equivalent to the vanishing of the normal derivative.
With the above intuition in place, the proof of this has two main points.
Prove that $H^2_0(U)=\{ u\in H^2(U): u=0, \, \nabla u=0\, \text{ on } \partial U\}$. This is done as in Evans's result for $H^1_0$, with the important fact being that the vanishing of $u$ and $\nabla u$ at the boundary guarantees that the extension of $u$ by 0 outside of $\bar{U}$ defines an element of $H^2(\mathbb{R}^n)$.
Make rigorous the previous discussion about normal and tangential derivatives. This is kind of tedious by the nature of traces of Sobolev functions, but the main statement is that the traces of $u$ and its gradient are actually related in the natural way: $\nabla_{\|} u$ is actually the weak derivative of $u|_{\partial U}$ on the boundary. More precisely, the trace map defines a bounded map $H^2(U)\to H^1(\partial U)$, with the above formula for the "boundary gradient" of the trace (of course we have to define $H^1(\partial U)$; this is done via coordinate patches and straightening the boundary).
Hope this helps.
Best Answer
$\newcommand{\R}{\mathbb{R}}$ We will use the Sobolev embedding $W^{1,1}(\R^n) \subseteq L^{n/(n-1)}(\R^n)$ in the form $$ \| u \|_{\frac{n}{n-1}} \le C \| D u \|_1 \qquad \text{for } u \in W^{1,1}(\R^n). $$ This is probably what any proof gives you. If you only have $\| u \|_{\frac{n}{n-1}} \le C (\| u \|_1 + \| D u \|_1)$, you can use arbitrage to get rid of $\| u \|_1$ on the RHS${}^1$. $\newcommand{\eps}{\varepsilon}$
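The "arbitrage" is a scaling argument; a sketch, applying the weaker inequality to the rescaled function $u_\lambda(x) := u(\lambda x)$:

```latex
\[
\|u_\lambda\|_{\frac{n}{n-1}} = \lambda^{-(n-1)} \|u\|_{\frac{n}{n-1}}, \qquad
\|u_\lambda\|_1 = \lambda^{-n} \|u\|_1, \qquad
\|Du_\lambda\|_1 = \lambda^{1-n} \|Du\|_1,
\]
\[
\text{so } \|u_\lambda\|_{\frac{n}{n-1}} \le C\big(\|u_\lambda\|_1 + \|Du_\lambda\|_1\big)
\text{, multiplied by } \lambda^{n-1} \text{, reads}
\]
\[
\|u\|_{\frac{n}{n-1}} \le C\big(\lambda^{-1}\|u\|_1 + \|Du\|_1\big)
\ \xrightarrow{\ \lambda \to \infty\ }\
\|u\|_{\frac{n}{n-1}} \le C \|Du\|_1 .
\]
```

The same scaling computation, with a general exponent $q$ on the left, forces $\lambda^{-n(1-1/q)} = \lambda^{1-n}$, i.e. $q = \frac{n}{n-1}$, which is the point of footnote ${}^1$.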
The next step is generalizing the inequality to $BV(\R^n)$, the space of functions of bounded variation${}^2$. This is similar to $W^{1,1}(\R^n)$, but instead of $u,Du \in L^1(\R^n)$ we require $u \in L^1(\R^n)$ and $Du \in M(\R^n)$: the distributional partial derivatives need to be representable by finite signed measures on $\R^n$. For each such $u$, one can take the approximation by convolution $u_\eps := u * \varphi_\eps$. It should be clear that $u_\eps \to u$ in $L^1(\R^n)$, and moreover $\| Du_\eps \|_1 \le \| Du \|_{M}$, where $M$ stands for the total variation norm of a (vector-valued) measure. By the Sobolev embedding, $$ \| u_\eps \|_{\frac{n}{n-1}} \le C \| D u_\eps \|_1 \le C \| D u \|_M $$ for each $\eps$. In $L^{\frac{n}{n-1}}(\R^n)$ one can take a weakly convergent subsequence, whose limit has to be $u$ (thanks to the $L^1$ convergence). In consequence, $$ \| u \|_{\frac{n}{n-1}} \le \liminf_{\eps \to 0} \| u_\eps \|_{\frac{n}{n-1}} \le C \| D u \|_M. $$
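The intermediate estimate $\| Du_\eps \|_1 \le \| Du \|_M$ is an instance of Young's inequality for the convolution of a finite measure with an $L^1$ function; a sketch, assuming a standard mollifier $\varphi_\eps \ge 0$ with $\int \varphi_\eps = 1$:

```latex
\[
D u_\eps = D(u * \varphi_\eps) = (Du) * \varphi_\eps, \qquad
\|(Du) * \varphi_\eps\|_{L^1}
\le \|\varphi_\eps\|_{L^1}\, \|Du\|_{M}
= \|Du\|_{M}.
\]
```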
Finally, let us look at the geometric meaning of this. Consider $u$ to be a characteristic function of some set: $\chi_A$. If $A \subseteq \R^n$ is a bounded smooth set, the divergence formula $$ \int_A \operatorname{div} \varphi(x) \, dx = \int_{\partial A} \varphi(x) \cdot \vec{n}(x) \, d \mathcal{H}^{n-1}(x) $$ can be interpreted as integration by parts. In other words, it tells us that the distributional differential of $\chi_A$ is $$ D \chi_A = \vec{n} \mathcal{H}^{n-1} \llcorner \partial A, $$ the outer normal vector field on the boundary (with surface measure on the boundary). In this case, the total variation norm is $$ \| D \chi_A \|_M = \| \mathcal{H}^{n-1} \llcorner \partial A \|_M = \mathcal{H}^{n-1} (\partial A), $$ while the $L^{\frac{n}{n-1}}$ norm of $\chi_A$ is simply $(\mathcal{H}^n(A))^{\frac{n-1}{n}}$. Hence, the Sobolev embedding gives us $$ (\mathcal{H}^n(A))^{\frac{n-1}{n}} \le C \mathcal{H}^{n-1} (\partial A). $$
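As a sanity check, both sides can be evaluated on a ball $B_r \subset \R^n$, with $\omega_n$ the volume of the unit ball:

```latex
\[
\mathcal{H}^n(B_r) = \omega_n r^n, \qquad
\mathcal{H}^{n-1}(\partial B_r) = n \omega_n r^{n-1},
\]
\[
(\mathcal{H}^n(B_r))^{\frac{n-1}{n}}
= \omega_n^{\frac{n-1}{n}}\, r^{n-1}
\le C \, n \omega_n \, r^{n-1}
= C\, \mathcal{H}^{n-1}(\partial B_r),
\]
```

so both sides scale like $r^{n-1}$, consistent with the scaling argument in footnote ${}^1$. With the sharp constant $C = \frac{1}{n\,\omega_n^{1/n}}$ (for which balls give equality), this is exactly the classical isoperimetric inequality.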
${}^1$ The same trick also shows that $\frac{n}{n-1}$ is the only possible exponent on the LHS.
${}^2$ The name comes from the 1-dimensional case. It turns out $u \in BV([0,1])$ if and only if it has a representative for which the variation $$ \sup \left\{ \sum_{k=1}^n |u(t_k)-u(t_{k-1})| : 0 = t_0 < \ldots < t_n = 1 \right\} $$ is finite.
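A minimal numeric sketch of the partition sums in this 1-D definition (the function names are my own): for $u(t) = |t - \tfrac12|$, the total variation on $[0,1]$ is $1$, and refining the partition drives the sums toward that supremum.

```python
def variation(u, ts):
    """Partition sum sum_k |u(t_k) - u(t_{k-1})| over the partition ts."""
    return sum(abs(u(ts[k]) - u(ts[k - 1])) for k in range(1, len(ts)))

u = lambda t: abs(t - 0.5)  # piecewise monotone; total variation 1 on [0, 1]

# Odd n keeps the kink t = 1/2 off the partition, so the sums only
# approach the supremum 1 as the partition is refined.
for n in (3, 5, 1001):
    ts = [k / n for k in range(n + 1)]
    print(n, variation(u, ts))
```

Any partition containing the kink $t = \tfrac12$ attains the supremum exactly, since $u$ is monotone on each side.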