How does this then imply that the solution is $u = f(y^\prime) = f(bx - ay)$


I am currently studying the textbook Partial Differential Equations: An Introduction, second edition, by Walter A. Strauss. Chapter 1.2 First-Order Linear Equations says the following:

Let us solve

$$au_x + bu_y = 0,$$

where $a$ and $b$ are constants not both zero.

Geometric Method. The quantity $au_x + bu_y$ is the directional derivative of $u$ in the direction of the vector $\mathbf{V} = (a, b) = a \mathbf{i} + b \mathbf{j}$. It must always be zero. This means that $u(x, y)$ must be constant in the direction of $\mathbf{V}$. The vector $(b, -a)$ is orthogonal to $\mathbf{V}$. The lines parallel to $\mathbf{V}$ (see Figure 1) have the equations $bx - ay = \text{constant}$. (They are called the characteristic lines.) The solution is constant on each such line. Therefore, $u(x, y)$ depends on $bx - ay$ only. Thus the solution is
$$u(x, y) = f(bx - ay), \tag{2}$$
where $f$ is any function of one variable.
[Figure 1]
Let's explain this conclusion more explicitly. On the line $bx - ay = c$, the solution $u$ has a constant value. Call this value $f(c)$. Then $u(x, y) = f(c) = f(bx - ay)$. Since $c$ is arbitrary, we have formula (2) for all values of $x$ and $y$. In $xyu$ space the solution defines a surface that is made up of parallel horizontal straight lines like a sheet of corrugated iron.

Coordinate Method. Change variables (or "make a change of coordinates"; Figure 2) to

$$x^\prime = ax + by \qquad y^\prime = bx - ay. \tag{3}$$

Replace all $x$ and $y$ derivatives by $x^\prime$ and $y^\prime$ derivatives. By the chain rule,

$$u_x = \dfrac{\partial{u}}{\partial{x}} = \dfrac{\partial{u}}{\partial{x^\prime}} \dfrac{\partial{x^\prime}}{\partial{x}} + \dfrac{\partial{u}}{\partial{y^\prime}} \dfrac{\partial{y^\prime}}{\partial{x}} = au_{x^\prime} + bu_{y^\prime}$$

$$u_y = \dfrac{\partial{u}}{\partial{y}} = \dfrac{\partial{u}}{\partial{y^\prime}} \dfrac{\partial{y^\prime}}{\partial{y}} + \dfrac{\partial{u}}{\partial{x^\prime}} \dfrac{\partial{x^\prime}}{\partial{y}} = bu_{x^\prime} - au_{y^\prime}.$$

Hence $au_x + bu_y = a(au_{x^\prime} + bu_{y^\prime}) + b(bu_{x^\prime} - au_{y^\prime}) = (a^2 + b^2)u_{x^\prime}$. So, since $a^2 + b^2 \neq 0$, the equation takes the form $u_{x^\prime} = 0$ in the new (primed) variables. Thus the solution is $u = f(y^\prime) = f(bx - ay)$, with $f$ an arbitrary function of one variable.
[Figure 2]

I understand that $au_x + bu_y = (a^2 + b^2)u_{x^\prime} = 0$, and since $a^2 + b^2 \neq 0$, we have that $u_{x^\prime} = 0$. But how does this then imply that the solution is $u = f(y^\prime) = f(bx - ay)$?

Best Answer

After the coordinate transformation, we know that $u=f(x',y')$. But since $u_{x'}=0$, we get that for a fixed $y'$ and variable $x'$, $f(x',y')$ is a constant. In other words, the value of $f(x',y')$ depends only on $y'$. With an abuse of notation, we can thus write $f(x',y')=f(y')=f(bx-ay)$.
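One way to spell out the "constant in $x'$" step, assuming $u$ is continuously differentiable in the primed variables: fix $y'$ and integrate $u_{x'} = 0$ from $0$ to $x'$ with the fundamental theorem of calculus (the integration variable $s$ is just a dummy name introduced here),

$$u(x', y') - u(0, y') = \int_0^{x'} u_{x'}(s, y')\, ds = 0.$$

So $u(x', y') = u(0, y')$ for every $x'$, and the right-hand side depends on $y'$ alone.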

To be more careful with the notation, we could fix a value of $x'$, say $x'=0$, and define $\tilde f(y')=f(0,y')$, so that $u=\tilde f(y')=\tilde f(bx-ay)$. Conversely, every differentiable function of $y'$ alone satisfies $u_{x'}=0$, so $\tilde f$ is an arbitrary function of $y'$, that is, of $bx-ay$.
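As a sanity check rather than a proof, here is a minimal numerical sketch of the conclusion. The values $a = 3$, $b = -2$, the profile $\sin$, the sample points, and the helper names `u` and `residual` are all arbitrary choices made for illustration; none of them come from the textbook. It estimates $au_x + bu_y$ by central differences and also checks that $u$ does not change when we move along $\mathbf{V} = (a, b)$.

```python
import math

def u(x, y, a, b, f):
    """Candidate solution u(x, y) = f(b*x - a*y)."""
    return f(b * x - a * y)

def residual(x, y, a, b, f, h=1e-5):
    """Central-difference estimate of a*u_x + b*u_y at (x, y)."""
    u_x = (u(x + h, y, a, b, f) - u(x - h, y, a, b, f)) / (2 * h)
    u_y = (u(x, y + h, a, b, f) - u(x, y - h, a, b, f)) / (2 * h)
    return a * u_x + b * u_y

# Illustrative (assumed) constants and sample points.
a, b = 3.0, -2.0
points = [(0.0, 0.0), (1.0, -0.5), (-2.3, 4.1)]

# The PDE residual should vanish up to discretization and rounding error.
max_residual = max(abs(residual(x, y, a, b, math.sin)) for x, y in points)

# Moving from (x0, y0) by t*(a, b) stays on the same characteristic line,
# so the value of u should not change.
x0, y0, t = 0.7, -1.2, 2.5
drift = abs(u(x0 + t * a, y0 + t * b, a, b, math.sin)
            - u(x0, y0, a, b, math.sin))

print("max |a*u_x + b*u_y| =", max_residual)
print("change along V      =", drift)
```

Both printed numbers should be tiny compared with the size of $u$ itself. Swapping `math.sin` for any other differentiable one-variable function should give the same behaviour, which is what "arbitrary $f$" means here.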
