Information propagation rate in 2nd-order PDEs

elliptic-equations, hyperbolic-equations, parabolic pde, partial differential equations

(Apologies if some details are wrong; this is my first time working through this material, and I would appreciate any feedback on correcting the problem statement as well as on the solution.)

Three common characteristics of 2nd order PDEs are elliptic, hyperbolic, and parabolic. For systems with many variables, I am going to describe them as
$$
0 = \underbrace{\left(\frac{1}{2}x^TH x+x^Tg + c\right)}_{F} u
$$

where $x = (x_1,…,x_n)$ and $x_i$ stands in place for $\partial f/\partial u_i$, $H$ is a symmetric matrix, and $g$ is a vector. The elements $H_{ij}$ and $g_i$ may be constants or functions of $u$. Let us assume they are constants for now. I am writing $F$ as an operator on $u$.

Then, according to Strauss 1.6, the system is

  • elliptic if $H$ is nonsingular and all the eigenvalues have the same sign
  • hyperbolic if $H$ is nonsingular, with some eigenvalues positive and some negative
  • parabolic if $H$ is singular, with exactly one 0 eigenvalue.

(My guess is that singular $H$ with more than 1 eigenvalue that is 0 can be reduced to parabolic with changes of variables.)

First question: is that interpretation correct?

I am trying to stare at these equations and use them to make sense of the following statement: "hyperbolic systems have finite information propagation, and parabolic/elliptic systems have infinite information propagation rate". I wish to make this concept precise.

  1. What exactly is meant by "information propagation"? Without necessarily including a time variable, can I assume that this is some bound on $\max_i|\partial u_i/\partial u_j|$?
  2. Why would some systems result in finite information propagation rate and others necessarily have infinite information propagation rate?

I understand how wave propagation works in the specific wave equation case, but 1) I'm not super clear on why diffusion works out to be infinite propagation rate and 2) I'm not sure how this relates to the known properties of the different PDE classes.

Any clarification or discussion would be appreciated!

Best Answer

By your notation I will assume you are looking at operators of the form $$ F(D)u = \frac12\sum_{i,j=1}^n H_{ij}\partial_{x_ix_j}u + \sum_{i=1}^n g_i\partial_{x_i}u + cu = 0. $$

First question: is that interpretation correct?

The classification given in Strauss is for PDEs in two dimensions only; in higher dimensions you still have a classification, but it's no longer exhaustive. Also you implicitly assume $H \neq 0$ so the system is genuinely of second order, which is why the parabolic case assumes exactly one eigenvalue is zero.

In general dimensions this is somewhat convention-dependent, but you have the following classification:

  • The equation is elliptic if $H$ is non-singular and all eigenvalues have the same sign.
  • The equation is hyperbolic if $H$ is non-singular and all but one of the eigenvalues have the same sign (the remaining eigenvalue has the opposite sign).
  • The equation is parabolic if $H$ has one zero eigenvalue, and all other eigenvalues have the same sign. This is somewhat convention-dependent however, and one usually assumes extra structure in this case.

Note this is not a complete classification; if $H$ is non-singular but neither elliptic nor hyperbolic, then the equation is sometimes referred to as ultrahyperbolic. However, there is limited literature on equations of this type.

Generally you should think of the model equations: Laplace's equation, the heat equation and the wave equation. The classification also extends more generally to nonlinear equations, etc, and while the convention varies this is generally based on how solutions to these equations behave.
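As a rough illustration of the classification above applied to the three model equations, here is a minimal Python sketch. It assumes the convention $F(D)u = \frac12\sum H_{ij}\partial_{x_ix_j}u + \dots$ used in this answer, with the last variable playing the role of time; the function name `classify`, the tolerance `tol`, and the particular matrices are illustrative choices, not part of the original answer.

```python
import numpy as np


def classify(H, tol=1e-12):
    """Classify a constant-coefficient second-order operator by the
    signs of the eigenvalues of its symmetric coefficient matrix H."""
    H = np.asarray(H, dtype=float)
    if not np.allclose(H, H.T):
        raise ValueError("H must be symmetric")

    eig = np.linalg.eigvalsh(H)               # real eigenvalues, ascending
    scale = max(1.0, float(np.abs(eig).max()))
    pos = int(np.sum(eig > tol * scale))      # number of positive eigenvalues
    neg = int(np.sum(eig < -tol * scale))     # number of negative eigenvalues
    n = eig.size
    zero = n - pos - neg                      # number of (numerically) zero ones

    if pos + neg == 0:
        return "not second order"             # H = 0
    if zero == 0:
        if pos == n or neg == n:
            return "elliptic"
        if pos == 1 or neg == 1:
            return "hyperbolic"
        return "ultrahyperbolic"
    if zero == 1 and (pos == n - 1 or neg == n - 1):
        return "parabolic"
    return "unclassified"


# Model equations in two space variables plus one extra variable (x_3 = t).
laplace = 2.0 * np.eye(3)             # u_xx + u_yy + u_zz = 0
heat = np.diag([-2.0, -2.0, 0.0])     # u_t - (u_xx + u_yy) = 0, with g = e_3
wave = np.diag([-2.0, -2.0, 2.0])     # u_tt - (u_xx + u_yy) = 0

for name, H in [("Laplace", laplace), ("heat", heat), ("wave", wave)]:
    print(name, "->", classify(H))
```

Note that only $H$ enters the classification; the first-order coefficients $g$ matter separately, for instance through the usual requirement $g_n \neq 0$ in the parabolic case.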

For the parabolic and hyperbolic cases, if $H$ is constant we have a distinguished direction $v$: in the parabolic case this is the eigenvector with eigenvalue zero, while in the hyperbolic case it is the eigenvector whose eigenvalue has the opposite sign to the others. By a linear change of variables we can assume $v = e_n,$ and writing $x_n=t$ the equation takes (after relabelling the coefficients) the form
$$
g_n \partial_{t}u + \frac12\sum_{i,j=1}^{n-1} H_{ij}\partial_{x_ix_j}u + \sum_{i=1}^{n-1} g_i \partial_{x_i} u + cu = 0
$$
in the parabolic case, and
$$
\frac12H_{nn}\partial_{t}^2u + \frac12\sum_{i,j=1}^{n-1} H_{ij}\partial_{x_ix_j}u + \sum_{i=1}^{n-1} g_i \partial_{x_i} u +g_n\partial_tu+ cu = 0
$$
in the hyperbolic case (I am omitting the details here, but some linear algebra is involved).

Therefore we naturally obtain a distinguished direction, which we suggestively denote by $t$, and which is completely determined by the equation! Note in the parabolic case we usually assume $g_n \neq 0,$ as otherwise the equation reduces to an elliptic equation in the first $(n-1)$ variables. In the hyperbolic case $H_{nn} < 0$ necessarily, since $H$ is non-degenerate and $H_{nn}$ is the eigenvalue of opposite sign (assuming the other eigenvalues are positive).
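The omitted linear algebra can be sketched as follows for constant coefficients. The orthogonal matrix $Q$, the eigenvalues $\lambda_k$ and the new coordinates $y$ are notation introduced here for illustration. Since $H$ is symmetric, write $H = Q\Lambda Q^T$ with $Q$ orthogonal and $\Lambda = \operatorname{diag}(\lambda_1,\dots,\lambda_n)$, and set $y = Q^Tx$. By the chain rule,
$$
\partial_{x_i} = \sum_{k=1}^n Q_{ik}\,\partial_{y_k},
\qquad\text{so}\qquad
\frac12\sum_{i,j=1}^n H_{ij}\partial_{x_ix_j}u = \frac12\sum_{k,l=1}^n (Q^THQ)_{kl}\,\partial_{y_ky_l}u = \frac12\sum_{k=1}^n \lambda_k\,\partial_{y_k}^2u,
$$
and similarly
$$
\sum_{i=1}^n g_i\partial_{x_i}u = \sum_{k=1}^n (Q^Tg)_k\,\partial_{y_k}u .
$$
Ordering the eigenvalues so that the exceptional one (zero in the parabolic case, of opposite sign in the hyperbolic case) is $\lambda_n$, and writing $y_n = t$, gives the two forms above with a diagonal spatial part. The coefficients there are the transformed ones, namely $\lambda_k$ and $(Q^Tg)_k$, relabelled as $H$ and $g$.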

Generally when working with parabolic and hyperbolic equations we assume there is a distinguished direction $\partial_t,$ even for more general variable coefficient and nonlinear equations${}^\ast$ - this is because in the applications we are interested in there is a natural time direction, and it is more convenient for purposes of analysis.

${}^\ast$Note the classification is also used to distinguish nonlinear PDEs, but here there is no standard way of doing this. Typically we require a suitable linearised equation to be elliptic/parabolic/hyperbolic, but what this means depends on the particular application. Also in this case we don't always have a distinguished time direction, which notably arises in the context of general relativity.

Now that I've explained how a time direction naturally arises in the parabolic and hyperbolic cases, I can turn to your question about propagation of information.

What exactly is meant by "information propagation"? Without necessarily including a time variable, can I assume that this is some bound on $\max_i|\partial u_i/\partial u_j|$?

Propagation of information generally refers to how local changes in the initial/boundary data are reflected in the corresponding solution. Note, however, that the type of boundary conditions varies depending on the equations we consider.

  • For elliptic equations we consider boundary value problems, where for a bounded domain $\Omega \subset \Bbb R^n$ we seek solutions $u$ subject to conditions on the boundary; e.g. $u = g$ on $\partial\Omega$ (Dirichlet) or $\partial_{\nu}u = h$ on $\partial\Omega$ where $\partial_{\nu}$ is the normal derivative (Neumann).

  • For parabolic and hyperbolic problems we consider initial boundary value problems, typically on domains of the form $\Omega' \times (0,T)$ where $\Omega' \subset \Bbb R^{n-1}$ is a bounded domain. In both cases we prescribe $u$ on $\partial\Omega' \times (0,T)$ (the boundary part) and in the parabolic case $u$ on $\Omega' \times \{0\}$ also (the initial part). For hyperbolic problems we need to prescribe initial values for both $u$ and $\partial_tu$ at $t=0.$

We impose these conditions because they turn out to be necessary for the problem to be well-posed; if you prescribe suitably regular initial/boundary data, then you can show a unique solution exists. There are many examples showing that well-posedness fails if you don't impose the correct conditions, and for instance this explains why there is no time direction if your equation is elliptic.
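One classical illustration of such a failure is Hadamard's example, stated here as a sketch and not taken from Strauss' text. Treat Laplace's equation in the plane as an initial value problem in $y$, and for each integer $n \geq 1$ take
$$
u_n(x,y) = \frac{1}{n^2}\sin(nx)\sinh(ny),
$$
which satisfies
$$
\partial_x^2u_n + \partial_y^2u_n = 0, \qquad u_n(x,0) = 0, \qquad \partial_yu_n(x,0) = \frac1n\sin(nx).
$$
The initial data tend to zero uniformly as $n \to \infty$, yet for any fixed $y > 0$
$$
\sup_{x}|u_n(x,y)| = \frac{\sinh(ny)}{n^2} \to \infty,
$$
so the solution does not depend continuously on the data. In this sense prescribing $u$ and $\partial_yu$ on a line, as one would for the wave equation, is the wrong kind of condition for an elliptic equation.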

In the elliptic and parabolic cases, local changes to the initial/boundary data can change the solution at every point (in the parabolic case, at every point for any time $t > 0$), which is what we mean by infinite speed of propagation. For wave equations this is not the case, however: changes to the initial data propagate along a 'wave cone.'
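To make the contrast concrete, here is a small Python sketch comparing the one-dimensional heat equation $u_t = u_{xx}$ and wave equation $u_{tt} = u_{xx}$ on the whole line. Both start from the same initial profile, the indicator function of $[-1,1]$, with zero initial velocity for the wave equation. The normalisation of the equations, the choice of initial data and the function names are assumptions made for illustration. The heat solution uses the standard heat-kernel formula, which for this data reduces to a difference of error functions, and the wave solution uses d'Alembert's formula (understood in the weak sense, since the data are discontinuous).

```python
import math


def initial_profile(x):
    """Indicator function of the interval [-1, 1]."""
    return 1.0 if abs(x) <= 1.0 else 0.0


def heat_solution(x, t):
    """Solution of u_t = u_xx with u(x, 0) = initial_profile(x), for t > 0.

    Written with erfc (and |x|, using evenness in x) so that the tiny
    but positive values far from the support are not lost to rounding.
    """
    x = abs(x)
    s = math.sqrt(4.0 * t)
    return 0.5 * (math.erfc((x - 1.0) / s) - math.erfc((x + 1.0) / s))


def wave_solution(x, t):
    """d'Alembert solution of u_tt = u_xx with u(x, 0) = initial_profile(x)
    and u_t(x, 0) = 0."""
    return 0.5 * (initial_profile(x - t) + initial_profile(x + t))


t = 0.1
for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x = {x:5.1f}  heat = {heat_solution(x, t):.3e}  "
          f"wave = {wave_solution(x, t):.3e}")
```

At a small time such as $t = 0.1$ the heat solution is strictly positive at every $x$, however far from $[-1,1]$, although extremely small there. The wave solution is exactly zero for $|x| > 1 + t$, i.e. outside the region the initial disturbance can reach at unit speed.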

The details of this are lengthy, so I refer you to Chapter 2 of Strauss' text which discusses these phenomena in the context of the wave and diffusion (heat) equations.

Why would some systems result in finite information propagation rate and others necessarily have infinite information propagation rate?

This is a rather deep question, but as I've alluded to in the above discussion, the behaviour of solutions to PDEs varies greatly depending on their classification. This is why, for instance, there is no unified theory of PDEs: small changes to the equation can result in very different behaviour, which is to be expected because different equations model different physical phenomena.
