Where are the imaginary components in a moment generating function (MGF) of a distribution

laplace-transform, moment-generating-functions, probability, probability-distributions

An MGF is

$$M_X(t)=E(e^{tX})=\int_{-\infty}^\infty e^{tx}f_X(x) dx$$

whereas a Laplace transform is

$$\mathcal L\{f_X(x)\}(s)= \int_0^\infty e^{-sx}f_X(x)dx$$

I am not referring to the sign difference, or the limits of integration.

My question is about $s\in \mathbb C$ being a complex number in the Laplace transform, whereas I only see real numbers in the idea of the MGF (i.e. in the series expansion of $e^{tx}$). Indeed, $M_X:\mathbb R \to [0,\infty]$, with the domain of $M_X$ defined as the set $D_X=\{t \mid M_X(t)<\infty\}$. Complex numbers are introduced in characteristic functions, but those are Fourier transforms, not Laplace transforms. Obviously nobody really cares, but I don't understand why: the LT captures oscillations and exponential decay (in standard engineering uses), whereas the FT only captures sinusoidal components.


The imaginary numbers do surface, but in the characteristic function of a distribution.

THE QUESTION: So is there really an imaginary component in the MGF (like in the Laplace), and if there is not, why is it taught that the MGF is the Laplace transform of the pdf?

My intuition is that in probability the Laplace transform, whose natural domain is the complex numbers, has been restricted to $t\in \mathbb R,$ corresponding to a segment of the real line around zero on which $E[e^{tX}]$ is finite, for different reasons:

  1. To be able to extract moments through the Taylor series.
  2. To classify distributions in relation to the exponential distribution, which is very useful in extreme value theory (analysis of the tails). This is best exemplified by the Chernoff bound:

$$\Pr(X \geq a) = \Pr\left(e^{tX} \geq e^{ta}\right)\leq \frac{E[e^{tX}]}{e^{ta}}= \frac{M_X(t)}{e^{ta}}=e^{-ta}M_X(t), \qquad t>0,$$

which indicates that a finite MGF results in exponentially decaying tails.
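
A numerical sketch of the bound for the standard normal, whose MGF $M_X(t)=e^{t^2/2}$ is known in closed form; minimizing the right-hand side over a grid of $t>0$ recovers the familiar tail bound $e^{-a^2/2}$:

```python
import math
from scipy.stats import norm

# Chernoff bound for the standard normal: P(X >= a) <= e^{-ta} M(t)
# for every t > 0; minimize the right-hand side over a grid of t.
a = 2.0
mgf = lambda t: math.exp(t * t / 2)
bound = min(math.exp(-t * a) * mgf(t) for t in (x / 100 for x in range(1, 500)))
exact = norm.sf(a)  # true tail probability P(X >= 2)

print(exact)   # about 0.0228
print(bound)   # exp(-a^2/2) = exp(-2), about 0.1353: a valid upper bound
```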

The third reason (to uniquely determine the distribution) would not really be a reason to keep the MGF: The characteristic function does have this role with the additional advantage that it exists for all distributions.
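
The first reason, extracting moments, is easy to check symbolically: the $n$-th raw moment is the $n$-th derivative of $M_X$ at $t=0$. A sketch with SymPy, using an exponential distribution with rate $3$ (an arbitrary choice) whose moments $E[X^n] = n!/3^n$ are known:

```python
import sympy as sp

t = sp.symbols('t')
M = 3 / (3 - t)  # MGF of Exponential(rate=3), finite for t < 3

# n-th raw moment = n-th derivative of the MGF evaluated at t = 0
moments = [sp.diff(M, t, n).subs(t, 0) for n in range(1, 4)]
print(moments)   # [1/3, 2/9, 2/9], i.e. n!/3^n for n = 1, 2, 3
```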

So it is as if the Laplace transform had been restricted to the real numbers (needless to say, the usual random variables are real-valued, but the domain of the MGF is not the range of the random variable) so as to capture the exponential nature of the distribution, while the characteristic function deals with the complex part.

Best Answer

The MGF has a real-valued domain and codomain, and this is different from the LT. It is reasonable to ask why the MGF is said to be the LT when no mention is made of this difference.

Billingsley does basically this in Probability and Measure, p. 278 (3rd edition):

The moment generating function is defined as $M(s) = E[e^{sX}] = [\ldots]$ for all $s$ for which this is finite [...] This function is also called the Laplace transform of $\mu$, especially in nonprobabilistic contexts.

Here he's implicitly using $s \in \mathbb{R}$. He gives no precise definition of "Laplace transform". Wikipedia's current definition of Laplace transform is rather vague (though understandably so) and implicitly restricts the domain to whatever complex numbers the integral converges for. It would certainly be reasonable to restrict the domain of the Laplace transform to non-negative real numbers for simplicity of convergence, and I imagine it would not be hard to find sources that do so. But let's stick to these definitions.

From a probabilistic perspective, moment generating functions are essentially interesting only when they converge in a real neighborhood of $0$. (Indeed, in that case they converge in the corresponding complex strip where the imaginary part is allowed to be arbitrary.) To say the moment generating function converges on the imaginary axis is no more than to say the characteristic function exists, which is of course fundamental, but since it's always true it's in some sense uninteresting.
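
To make the contrast concrete, consider the standard Cauchy distribution: $E[e^{tX}]$ is infinite for every real $t \neq 0$, yet the characteristic function exists everywhere and equals $e^{-|t|}$. A numerical sketch (by symmetry the imaginary part vanishes, so only the cosine integral is needed):

```python
import numpy as np
from scipy.integrate import quad

# Standard Cauchy pdf: no MGF except at t = 0, but the characteristic
# function E[e^{itX}] exists for all real t and equals exp(-|t|).
pdf = lambda x: 1.0 / (np.pi * (1.0 + x * x))

def cf(t):
    # imaginary part vanishes by symmetry; weight='cos' invokes a
    # quadrature routine designed for oscillatory Fourier integrals
    return 2.0 * quad(pdf, 0, np.inf, weight='cos', wvar=t)[0]

print(cf(1.0))   # close to exp(-1), about 0.3679
```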

It is sometimes convenient to view moment generating functions as complex analytic objects, depending on the context, but often the behavior along the real axis is the key bit, so it's understandable to focus on that. For a probabilist to say "the moment generating function is the Laplace transform", they mean no more and no less than "the integral is basically the same". There's no deeper meaning behind such statements, no lurking complex variable theory, or really any additional insight.

Edit in response to follow-up question: The remaining question concerns "sinusoidal structure in the pdf" and the notion that the Fourier transform (and hence the characteristic function) "scans for sinusoids" while the Laplace transform (and hence the moment generating function) "scans for exponentials and sinusoids". These terms are of course informal, and I do not know what the remaining mathematical question is, so I will try to make these notions more formal in the hopes of contributing useful discussion.

The notion that the Fourier transform "scans for sinusoids" is probably based on a conflation of the Fourier inversion formula and Fourier series. The Fourier inversion formula applies to integrals like $E[e^{isX}] = \int_{-\infty}^{\infty} e^{isx} f(x)\,dx$ over the whole real line. It allows us to recover $f(x)$ essentially by performing the same sort of integral again, but this time over $s$, and toggling signs. Fourier series instead allow us to expand a periodic function on $[0, 2\pi]$ as an infinite sum of the form $f(t) = a_0 + \sum_{n=1}^\infty \big(a_n \cos(nt) + b_n \sin(nt)\big)$, or equivalently $\sum_{n=-\infty}^\infty c_n e^{int}$. The coefficients are given by $c_n = \frac{1}{2\pi} \int_0^{2\pi} e^{-int} f(t)\,dt$, which is extremely similar to $E[e^{inX}]$ (up to the sign convention).
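
The "picking off coefficients" interpretation can be verified numerically; here $f(t)=\cos(2t)$ is an arbitrary test function whose only non-zero coefficients are $c_{\pm 2} = 1/2$:

```python
import numpy as np
from scipy.integrate import quad

# Fourier coefficients c_n = (1/2*pi) * integral of e^{-int} f(t) over
# [0, 2*pi] literally pick off the sinusoidal components of f.
f = lambda t: np.cos(2 * t)

def c(n):
    re = quad(lambda t: np.cos(n * t) * f(t), 0, 2 * np.pi)[0]
    im = quad(lambda t: -np.sin(n * t) * f(t), 0, 2 * np.pi)[0]
    return (re + 1j * im) / (2 * np.pi)

print(abs(c(2)))   # 0.5: the cos(2t) component is detected
print(abs(c(3)))   # essentially 0: no frequency-3 component
```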

Now, in Fourier series it makes perfect sense to say your integral is "scanning for sinusoids", since it's literally picking off coefficients of a series expansion. For the general Fourier inversion on the whole real line, that interpretation becomes more tenuous. For instance, if $f \colon \mathbb{R} \to \mathbb{R}$ is given by $f(x) = \cos(x)$, the analogy with Fourier series would say that you should get a delta function recording this one non-zero coefficient, but the integral $\int_{-\infty}^\infty e^{isx} \cos(x)\,dx$ doesn't even converge (for any $s \in \mathbb{C}$!). So this idea quickly runs into difficulties, though it can be salvaged if one is willing to allow "generalized functions" and the like into the picture. Nonetheless, the rough notion that $E[e^{isX}]$ is somehow "scanning for sinusoids" and that $E[e^{isX}]$ is the "frequency $s$ component" of $f(x)$ seems reasonably unobjectionable at the level of a heuristic.

The notion that the Laplace transform "scans for exponentials and sinusoids" is probably even more tenuous from a rigorous perspective. One immediate issue is that the Laplace transform is often taken to be an integral over $[0, \infty)$, whereas the moment-generating function integral $E[e^{sX}]$ is over $(-\infty, \infty)$. This matters, since for instance at $s=1$ the integrand involves $e^x$, which decays to the left but increases without bound to the right. This is a hint that, for the moment generating function to converge in a real neighborhood of $0$, we'll need the density to decay at least exponentially in both tails. The Cauchy distribution decays only quadratically, which is a hint that it is in some ways pathological.
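
To see the failure concretely, truncate the Cauchy MGF integral at $\pm R$: because $e^{tx}$ eventually beats the quadratic decay of the density, the truncated integrals grow without bound as $R$ increases (a sketch; $t = 0.1$ is an arbitrary positive value):

```python
import numpy as np
from scipy.integrate import quad

# Standard Cauchy pdf decays only like 1/x^2, so E[e^{tX}] diverges
# for every t != 0: the truncated integral keeps growing with R.
pdf = lambda x: 1.0 / (np.pi * (1.0 + x * x))
t = 0.1

vals = [quad(lambda x: np.exp(t * x) * pdf(x), -R, R)[0] for R in (10, 50, 100)]
print(vals)   # strictly increasing, with no finite limit as R grows
```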

But in any case, the Laplace transform is not picking off some coefficient in some series expansion. Its inverse is somewhat more involved than the Fourier transform's inverse (it involves a contour integral in the complex plane). There are senses in which the Laplace transform can be viewed as picking off coefficients, but they are broadly similar to, and less satisfying than, the points above about the Fourier transform.