[Math] How is the Laplace Transform a Change of basis

change-of-basis, eigenfunctions, eigenvalues-eigenvectors, laplace-transform, linear-algebra

This question is primarily based on the following answer's way of reasoning,

https://math.stackexchange.com/a/2156002/525644

If you want to write a new answer to the question "How is the Laplace Transform a change of basis?", please do.

In jnez71's answer, he concludes that the Laplace transform is a change of basis:


Now let's look at that mysterious Laplace transform.
$$\mathscr{L}(f(x)) = \int_{-\infty}^\infty e^{-sx}f(x) \, dx$$

Imagine all possible values of $e^{-sx}$ in a big matrix $^1$, where each
row corresponds to plugging in a specific $s$ and each column
corresponds to plugging in a specific $x$. (This matrix is orthonormal
if $s=i\omega$, i.e. the Fourier transform). If you select some $s$,
you are plucking out a specific value of the function that resulted
from the multiplication of this matrix with the vector $f(x)$, a
function we call $F(s):=\mathscr{L}(f(x))$. Specifically, $$F(s=3) = f(x) \cdot e^{-3x}$$

(where that dot is an inner product, not ordinary multiplication). We
say that $F(s)$ is just $f(x)$ expressed on a basis of exponential
functions$^2$. Choosing a specific value of $s=s_1$ is picking out the
value of $f(x)$ in the $e^{-s_1x}$ direction. The entire $e^{-sx}$
can be viewed as the change of basis matrix.

Wait, what basis were we on before if we're on the exponentials now? The Dirac deltas. Take an inner product of some function with a Dirac delta and notice how you get back that function at the action point of the Dirac delta$^3$. This is sometimes called the sifting theorem, but it should be clear that if we can project a vector (via inner product) and just get back some component of that vector, that component was how much the vector had in the direction we projected it onto.


Questions:

1. Can someone literally write the matrix he's referring to?

2. Please elaborate on how and where exactly this change of basis occurs (even though this information exists in his answer, I couldn't get a good grasp of it).

3. Regarding "Take an inner product of some function with a Dirac delta and notice how you get back that function at the action point of the Dirac delta": how does this imply that we were previously on the Dirac delta basis?

Best Answer

(Apparently Stack Exchange doesn't notify me for being mentioned in a question! Well, r/3b1b luck has brought me here. Hopefully I can help clear this up!)

It isn't physically possible to write down an uncountably infinite-sized matrix, but what I can do is start with the discrete case and encourage you to "take the limit" in your mind.

Suppose we have a real-valued, discrete-time function with compact support $f : \{0, 1, 2, 3\} \to \mathbb{R}$. Here is a plot of that function,

(figure: plot of the discrete-time function $f$)

and here are its precise values, $$ f := \begin{bmatrix}1 \\ 1.3 \\ 0 \\ 1\end{bmatrix} $$

We can have another function $g : \{0, 1, 2, 3\} \to \mathbb{R}$ from the same function space and see that we can naturally define addition, $$ f + g := \begin{bmatrix}f(0)+g(0) \\ f(1)+g(1) \\ f(2)+g(2) \\ f(3)+g(3)\end{bmatrix} $$

scaling, $$ af := \begin{bmatrix}af(0) \\ af(1) \\ af(2) \\ af(3)\end{bmatrix} $$

and even an inner-product, $$ \langle f, g \rangle := f(0)g(0) + f(1)g(1) + f(2)g(2) + f(3)g(3) $$

Our function space is really a vector space with 4 dimensions (the $t=0$ dimension, the $t=1$ dimension, the $t=2$ dimension, and the $t=3$ dimension).
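
To make this concrete, here is a minimal numpy sketch of the same four-sample space (the second function $g$ and the scale factor are arbitrary choices, just for illustration):

```python
import numpy as np

# The four samples of f, and some other function g from the same space
f = np.array([1.0, 1.3, 0.0, 1.0])
g = np.array([0.5, -1.0, 2.0, 0.0])   # arbitrary example g

f_plus_g = f + g                      # pointwise addition, as defined above
af = 2.0 * f                          # scaling by a = 2
inner = np.dot(f, g)                  # <f, g> = f(0)g(0) + f(1)g(1) + f(2)g(2) + f(3)g(3)

print(f_plus_g, af, inner)
```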

Since $f$ is a vector, we can talk about change-of-basis. Specifically, we can express $f$ as a weighted sum (linear combination) of some other vectors, and then use those weights as our new expression for $f$. For example, the following are equivalent: \begin{align} f &= \color{red}{1}\begin{bmatrix}1 \\ 0 \\ 0 \\ 0\end{bmatrix} + \color{red}{1.3}\begin{bmatrix}0 \\ 1 \\ 0 \\ 0\end{bmatrix} + \color{red}{0}\begin{bmatrix}0 \\ 0 \\ 1 \\ 0\end{bmatrix} + \color{red}{1}\begin{bmatrix}0 \\ 0 \\ 0 \\ 1\end{bmatrix}\\ \\ &= \color{blue}{0.5}\begin{bmatrix}2 \\ 2 \\ 0 \\ 0\end{bmatrix} + \color{blue}{-0.3}\begin{bmatrix}0 \\ -1 \\ 0 \\ 0\end{bmatrix} + \color{blue}{0}\begin{bmatrix}0 \\ 0 \\ 3 \\ 0\end{bmatrix} + \color{blue}{-i}\begin{bmatrix}0 \\ 0 \\ 0 \\ i\end{bmatrix}\\ \end{align}

If we call the first expansion's collection of vectors the "$t$"-basis then we see that the expression of $f$ as $\color{red}{\begin{bmatrix}1 & 1.3 & 0 & 1\end{bmatrix}^\intercal}$ is just how $f$ "looks" when thinking in terms of $t$. If we call the second expansion's collection of vectors the "$b$"-basis then, thinking in terms of $b$, the expression for $f$ is $\color{blue}{\begin{bmatrix}0.5 & -0.3 & 0 & -i\end{bmatrix}^\intercal}$.

Neither of these expressions is more "correct" than the other, but depending on the context, one can be more useful (say if your problem involves operators that have a special relationship with the $b$-basis). If we are still thinking of $f$ as a function, then we say that expressing $f$ in terms of the $t$ basis is "$f$ as a function of $t$" (since $f$ "assigns" a value to each of the $t$ basis vectors). Likewise, $\begin{bmatrix}0.5 & -0.3 & 0 & -i\end{bmatrix}^\intercal$ would be the values that $f$ assigns to the $b$ basis vectors, i.e. $f(b=0)=0.5$, $f(b=1)=-0.3$, etc...
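
A quick numerical check of the two expansions above (the second basis has complex entries, so everything is kept complex):

```python
import numpy as np

t_basis = np.eye(4)                                 # the "t"-basis (standard basis) as columns
b_basis = np.array([[2,  0, 0, 0 ],
                    [2, -1, 0, 0 ],
                    [0,  0, 3, 0 ],
                    [0,  0, 0, 1j]], dtype=complex)  # columns are the "b"-basis vectors

f_t = np.array([1, 1.3, 0, 1], dtype=complex)        # weights on the t-basis
f_b = np.array([0.5, -0.3, 0, -1j], dtype=complex)   # weights on the b-basis

# Both weighted sums reconstruct the same underlying vector f
print(np.allclose(t_basis @ f_t, b_basis @ f_b))     # True

# And the b-basis weights can be recovered by solving the linear system
print(np.linalg.solve(b_basis, f_t))                 # [0.5, -0.3, 0, -1j]
```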

We are interested in a specific basis known as the "Fourier" basis. The Fourier basis can be defined for a vector space of any (including infinite) dimension, but here it is for our case of 4 dimensions: $$ \Omega_4 := \Big{\{} \begin{bmatrix}1 \\ 1 \\ 1 \\ 1\end{bmatrix}, \begin{bmatrix}1 \\ i \\ -1 \\ -i\end{bmatrix}, \begin{bmatrix}1 \\ -1 \\ 1 \\ -1\end{bmatrix}, \begin{bmatrix}1 \\ -i \\ -1 \\ i\end{bmatrix}\Big{\}} $$

(Note I have left out a scale factor of $\frac{1}{\sqrt{4}}$ for notational clarity).

As with any new basis, our $f$ in the original basis can be expressed in the Fourier basis by solving, \begin{align} F_{t\omega}f_\omega &= f_t\\ \\ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1& -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i\end{bmatrix} \begin{bmatrix} f_\omega(0)\\ f_\omega(1)\\ f_\omega(2)\\ f_\omega(3)\end{bmatrix} &= \begin{bmatrix} 1 \\ 1.3 \\ 0 \\ 1 \end{bmatrix}\\ \\ f_\omega(0)\begin{bmatrix}1 \\ 1 \\ 1 \\ 1\end{bmatrix}+ f_\omega(1)\begin{bmatrix}1 \\ i \\ -1 \\ -i\end{bmatrix}+ f_\omega(2)\begin{bmatrix}1 \\ -1 \\ 1 \\ -1\end{bmatrix}+ f_\omega(3)\begin{bmatrix}1 \\ -i \\ -1 \\ i\end{bmatrix} &= \begin{bmatrix} 1 \\ 1.3 \\ 0 \\ 1 \end{bmatrix} \end{align}

where $f_t$ is $f$ expressed in terms of the $t$-basis, $f_\omega$ is the same vector $f$ expressed in terms of the (Fourier) $\omega$-basis, and $F_{t\omega}$ is the change-of-basis matrix whose columns are the Fourier basis vectors. Taking a "Fourier transform" means solving this equation for $f_\omega$ given $f_t$. Special properties of the Fourier basis, namely orthogonality, make inverting this matrix as easy as taking a complex conjugate, leaving just a matrix multiplication to be done. Further, special symmetry properties of this matrix allow for an even speedier multiplication known as the "fast Fourier transform" algorithm, "FFT".
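
Here is that computation as a small numpy sketch, with the $\frac{1}{\sqrt{4}}$ factor still dropped, so the inverse picks up a $1/n$ instead:

```python
import numpy as np

n = 4
t = np.arange(n)
# Change-of-basis matrix whose columns are the (unnormalized) Fourier basis vectors:
# F[t, w] = exp(2j*pi*t*w/n), which equals i**(t*w) when n = 4
F = np.exp(2j * np.pi * np.outer(t, t) / n)

f_t = np.array([1.0, 1.3, 0.0, 1.0])

# Solve F f_w = f_t.  Orthogonality of the columns (each with norm sqrt(n)) means
# F^{-1} = conj(F).T / n, so no general matrix inversion is required.
f_w = F.conj().T @ f_t / n

print(np.allclose(F @ f_w, f_t))              # True: the Fourier expansion reproduces f_t
print(np.allclose(f_w, np.fft.fft(f_t) / n))  # True: matches numpy's DFT up to the 1/n placement
```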

So with these definitions, it is clear that the Fourier transform is just a change-of-basis, and in a finite-dimensional setting, can be explicitly written with matrix operations. What is left is to write the Fourier basis in terms of an $n$-dimensional space and then try to imagine what this matrix looks like as $n \to \infty$ (for when $f : \mathbb{R} \to \mathbb{R}$). The general expression is that the entry in the $t$th row and $\omega$th column of the $n$-dimensional matrix $F_{t\omega}$ is $\frac{1}{\sqrt{n}}e^{i 2\pi t\omega/n}$. Holding $\omega$ fixed means looking at all the rows down a specific column, and we see that those values are a discretized complex sinusoid on $t$. (figure: the DFT change-of-basis matrix, each column a discretized complex sinusoid)

This Wikipedia article provides more detail (make sure to read the last section about the infinite resolution limit), albeit with different notation than what I have used here (sorry!). In particular, the article defines the (Discrete) FT matrix to be the inverse (conjugate transpose) of my $F_{t\omega}$. That is, \begin{align} F_{t\omega}f_\omega &= f_t\ \ \ \ \ \ \ \ \ \ \ \text{my convention emphasizing change-of-basis}\\ f_\omega &= F_{\omega t}f_t\ \ \ \ \ \ \ \ \text{typical convention emphasizing algorithm implementation}\\ F_{t\omega} &= \text{conj}(F_{\omega t})\ \ \ \ \text{simple inverse due to orthogonality and symmetry} \end{align}

Okay okay, so I haven't mentioned the Laplace transform yet. Well, conceptually it's the exact same thing but the basis functions aren't restricted to only real $\omega$ values. So in the expression $e^{it\omega}$ we replace $\omega$ by $s$ because $s$ sounds like a good letter for representing complex numbers and then "absorb" the $i$ into that $s$, leaving $e^{ts}$. In order to discretize $e^{ts}$ to express it as a matrix, we will need a way to index the $s$-columns with complex numbers, and it can be confusing to think about what it means to grab the $3i+2$th column of a matrix (you need more indices).

I'm not going to bother doing that here because as a famous mathematician once said (paraphrase) "linear algebra is a trivial subject made difficult by unnecessary use of matrices." Any invertible linear operator can be conceptualized as a change-of-basis. The Laplace transform is an invertible linear operator on the vector space of $L^2$ functions. QED. The Laplace transform tends to be a bit gnarlier than its cross-section, the Fourier transform, because the Laplace basis as a whole is not orthogonal.
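
That said, if you want something concrete to poke at, here is one crude numerical sketch (the truncation, grid, and test function below are arbitrary choices, not a standard construction): sample the one-sided transform $F(s)=\int_0^\infty e^{-st}f(t)\,dt$ on a finite grid of complex $s$ values, flatten that grid into a single row index, and approximate the integral with a Riemann sum, so that "taking the Laplace transform" becomes an honest matrix-vector product:

```python
import numpy as np

dt = 0.01
t = np.arange(0.0, 20.0, dt)                        # truncated time axis
sigma = np.linspace(0.5, 3.0, 6)                    # a few real parts of s
omega = np.linspace(-5.0, 5.0, 11)                  # a few imaginary parts of s
s = (sigma[:, None] + 1j * omega[None, :]).ravel()  # flatten the complex grid into one row index

# Rows indexed by s, columns by t: a finite stand-in for the "big matrix" e^{-st}
L = np.exp(-np.outer(s, t)) * dt                    # the dt makes each row a Riemann sum

f = np.exp(-2.0 * t)                                # f(t) = e^{-2t}, whose transform is 1/(s+2)
F_of_s = L @ f                                      # "taking the Laplace transform" as a mat-vec

print(np.max(np.abs(F_of_s - 1.0 / (s + 2.0))))     # about dt/2: the Riemann-sum discretization error
```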

Alright so what's up with the Dirac deltas? What it means to "choose a basis" is to name a certain collection of vectors as the ones written as all zeros with a single 1 in the $k$th row. When you are in a particular basis and thus writing matrix expressions like $\begin{bmatrix} 2 & 3 \end{bmatrix}^\intercal$, you're really taking the perspective that your vector is $2$ in the $\begin{bmatrix} 1 & 0 \end{bmatrix}^\intercal$ direction and $3$ in the $\begin{bmatrix} 0 & 1 \end{bmatrix}^\intercal$ direction. In the finite-dimensional function setting, these basis vectors are the Kronecker-delta functions. In the infinite-dimensional "distributional" setting, they are the Dirac deltas. To be more precise, the actual property is "sifting": an inner product of the vector $f$ with the $\tau$-th standard basis vector returns $f$ evaluated at $\tau$ (i.e. the $\tau$-th coefficient). (This idea might help you understand "Green's functions" later on.) \begin{gather} \langle \delta_\tau, f \rangle = f(\tau) \tag{generally}\\[6pt] \langle \delta_0, f \rangle = \begin{bmatrix}1 \\ 0 \\ 0 \\ 0\end{bmatrix} \cdot \begin{bmatrix}2 \\ 9.1 \\ i \\ 1+i\end{bmatrix} = 2 = f(0) \tag{example} \end{gather}
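
In the 4-dimensional setting the "Dirac delta" is just the Kronecker delta, i.e. a standard basis vector, and sifting is an ordinary inner product:

```python
import numpy as np

f = np.array([2.0, 9.1, 1j, 1 + 1j])      # the example vector from above
delta_0 = np.array([1.0, 0.0, 0.0, 0.0])  # Kronecker delta at t = 0
delta_2 = np.array([0.0, 0.0, 1.0, 0.0])  # Kronecker delta at t = 2

print(np.dot(delta_0, f))                  # (2+0j): sifting returns f(0)
print(np.dot(delta_2, f))                  # 1j:     sifting returns f(2)
```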

Last but not least, why are these bases so special anyway? Well, they are actually the eigenvectors of the derivative operator (the derivative of an exponential is a scaling of that exponential!), so when your problem involves derivatives, expressing your functions on the Laplace or Fourier basis can simplify things to scalar operations. Consider one possible discrete representation of the derivative operator for our 4-dimensional function space, a first-order finite difference with periodic boundary condition, whose eigenbasis is the Fourier basis: \begin{gather} \frac{d}{dt} = \begin{bmatrix} 1 & 0 & 0 & -1 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{bmatrix}\\[5pt] \text{eigvec}\big{(}\frac{d}{dt}\big{)} = \Omega_4 \end{gather}
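
A quick check that each Fourier basis vector really is an eigenvector of this difference operator (the eigenvalue formula in the comment comes from the periodic shift):

```python
import numpy as np

D = np.array([[ 1,  0,  0, -1],
              [-1,  1,  0,  0],
              [ 0, -1,  1,  0],
              [ 0,  0, -1,  1]], dtype=complex)

for w in range(4):
    v = np.exp(2j * np.pi * w * np.arange(4) / 4)   # the w-th column of Omega_4: [1, i^w, i^2w, i^3w]
    lam = 1 - np.exp(-2j * np.pi * w / 4)           # its eigenvalue under the backward difference
    print(w, np.allclose(D @ v, lam * v))           # True for every w
```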

The derivative operator is diagonalized on the Fourier basis. I.e. it becomes expressed as pointwise scaling. (More generally, the Fourier basis is tied to the roots of unity and diagonalizes any "circulant" operator, i.e. convolutions). Let $D_t$ be some linear combination of derivative operators wrt $t$ and let $\Lambda_\omega$ be a diagonal operator of its eigenvalues. Then "taking the Fourier transform of a linear differential equation" is the following change of basis, \begin{align} D_t f_t &= h_t \tag{differential system on $t$}\\ D_t F_{t \omega} f_\omega &= F_{t \omega} h_\omega\\ F_{t \omega}^{-1} D_t F_{t \omega} f_\omega &= h_\omega\\ \Lambda_\omega f_\omega &= h_\omega \tag{decoupled system on $\omega$}\\ \end{align}
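
Numerically, the conjugation $F_{t\omega}^{-1}D_tF_{t\omega}$ does come out diagonal, and solving the system reduces to one scalar division per mode (the right-hand side $h$ below is an arbitrary zero-mean example, chosen so the zero eigenvalue of this particular $D$ isn't an obstruction):

```python
import numpy as np

n = 4
t = np.arange(n)
F = np.exp(2j * np.pi * np.outer(t, t) / n)             # change-of-basis matrix F_{tw}

D = np.array([[ 1,  0,  0, -1],
              [-1,  1,  0,  0],
              [ 0, -1,  1,  0],
              [ 0,  0, -1,  1]], dtype=complex)

Lam = np.linalg.inv(F) @ D @ F                           # F^{-1} D F
print(np.allclose(Lam, np.diag(np.diag(Lam))))           # True: D is diagonal in the Fourier basis

# "Taking the Fourier transform" of the system D f = h:
h_t = np.array([0.2, -0.1, 0.3, -0.4], dtype=complex)    # zero-mean right-hand side
h_w = np.linalg.solve(F, h_t)                            # h expressed in the Fourier basis
lam = np.diag(Lam)
f_w = np.zeros(n, dtype=complex)
nz = np.abs(lam) > 1e-12                                 # skip the zero eigenvalue (constant mode)
f_w[nz] = h_w[nz] / lam[nz]                              # decoupled: one scalar division per mode
f_t = F @ f_w                                            # change back to the t-basis
print(np.allclose(D @ f_t, h_t))                         # True
```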

Hope this helps! For the record, it's not hard to find documents out there explaining these exact ideas in a more rigorous way. Maybe this answer will make it easier to break into some of those more precise explanations. Good luck!
