[Math] Characteristic functions as a Fourier transform

Tags: fourier-transform, intuition, probability-theory

I know that the characteristic function of a given probability density function is unique and this fact is used when proving some useful limit properties like the central limit theorem.

However, as far as I understand, the Fourier transform is well defined for periodic functions, not for nonperiodic ones. My question is: why does the method of the Fourier transform work even though the probability density functions we deal with are not usually periodic on the real line?

Best Answer

The Fourier Series


A Fourier series (a countable expansion in sines and cosines) is defined only for periodic functions, because the sines and cosines in the series are assumed to have frequencies that are harmonics (integer multiples) of the original function's fundamental frequency. Conversely, a countable sum of sines and cosines is periodic as long as the periods have an LCM (see, e.g., Sum of two periodic functions is periodic?), which is certainly the case when all the terms are harmonics of a common fundamental (every ratio of periods is rational, so an LCM must exist). Hence a Fourier series, when it converges, must converge to a periodic function.
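This is easy to check numerically. Below is a small sketch (with an arbitrary, made-up set of harmonic amplitudes) confirming that a finite sum of harmonics of a fundamental repeats with the fundamental period $2\pi$:

```python
import numpy as np

# Hypothetical example: a finite sum of harmonics k = 1..5 of a fundamental
# frequency 1. Each term sin(k x) has period 2*pi/k, so the LCM of the
# periods is 2*pi, and the sum must repeat with period 2*pi.
rng = np.random.default_rng(0)
coeffs = rng.normal(size=5)  # arbitrary amplitudes for harmonics 1..5

def f(x):
    return sum(c * np.sin((k + 1) * x) for k, c in enumerate(coeffs))

x = np.linspace(0, 2 * np.pi, 1000)
assert np.allclose(f(x), f(x + 2 * np.pi))  # periodic with period 2*pi
```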

The Fourier Transform


A Fourier transform exists for any real function $f(x)$ for which the integral $$ \int_{-\infty}^{\infty}f(x)e^{i\omega x}dx $$ exists (when $f$ is a probability density, this is exactly $E[e^{i\omega X}]$, the characteristic function). Nothing in this definition requires $f$ to be periodic.

But there's also some intuition here. You can informally extend the Fourier-series idea to the uncountable "basis" $\Omega = \{e^{i\omega x} \ | \ \omega \in [0,\infty)\}$ (informally, because uncountable bases raise subtle issues: the dimension of the space, if defined as the number of basis elements, becomes ambiguous, and one basis can even properly contain another while both span the same space). To write a function in this basis, we would need some function $F(\cdot)$ so that $F(\omega)$ returns the coefficient of $e^{i\omega x}$ for each $\omega \in [0,\infty)$. In other words, we would have

$$ f(x) \ ``=" \sum_{\omega\in [0,\infty)} F(\omega)e^{i\omega x}, $$

which is for all practical purposes the same as

$$ f(x) = \int_{0}^\infty F(\omega) e^{i\omega x}d\omega. $$

Notice that we are admitting all real frequencies, including irrational ones, so integrals over $\Omega$ like the one above don't necessarily produce a periodic function. That is, neither $F(\omega)$ nor $f(x)$ must be (nor in general will be) periodic. What we now have is technically known as a Fredholm integral equation for $F(\omega)$, and one which has the solution

$$ F(\omega) = \int_{-\infty}^{\infty}f(x)e^{-i\omega x}dx, $$

which is precisely the Fourier transform (up to a normalization constant, depending on the convention chosen).
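To see the transform acting on a decidedly nonperiodic function, here is a numerical sketch: the standard normal PDF is transformed by brute-force integration, and the result matches the known characteristic function $e^{-\omega^2/2}$ of $N(0,1)$:

```python
import numpy as np

# The standard normal PDF is not periodic, yet its Fourier transform
# E[e^{i*omega*X}] is perfectly well defined; for N(0, 1) it is known
# to equal exp(-omega**2 / 2).
x = np.linspace(-10, 10, 20001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)  # standard normal PDF

def char_fn(omega):
    # Riemann-sum approximation of  integral f(x) e^{i omega x} dx
    return np.sum(f * np.exp(1j * omega * x)) * dx

for w in (0.0, 0.5, 1.0, 2.0):
    assert abs(char_fn(w) - np.exp(-w**2 / 2)) < 1e-6
```

The grid bounds ($\pm 10$) and step are arbitrary choices; the Gaussian decays so fast that the truncation error is negligible.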

The Characteristic Function: Or, Why Probability Theorists Care


The characteristic function is therefore just the Fourier transform of the PDF. Okay. But why is it important to know the Fourier transform of the PDF?

The answer is that, if the PDF of the random variable $X$ is $f$, then the derivatives of $E[e^{i\omega X}]$ at $\omega = 0$ give the moments of $f$ up to known factors. Since $e^a = 1+ a + \frac{a^2}{2!} + \cdots$ converges for all $a$, one might hope that expectation and summation can be interchanged, i.e. that $$ \int_{-\infty}^{\infty}f(x)e^{i\omega x}dx = E[e^{i\omega X}] = E\left[1 + i\omega X + \frac{(i\omega)^2}{2!}X^2 + \cdots \right] = 1 + i\omega E[X] - \frac{\omega^2}{2!}E[X^2] - \cdots $$ In fact this is true whenever the moments exist. So you then have

$$ -i\left.\frac{dE[e^{i\omega X}] }{d\omega}\right|_{\omega = 0}= E[X],\qquad -\left.\frac{d^2E[e^{i\omega X}]}{d\omega^2}\right|_{\omega = 0} = E[X^2], $$ and in general $E[X^n] = (-i)^n\varphi^{(n)}(0)$, where $\varphi(\omega) = E[e^{i\omega X}]$. Knowing the characteristic function means you get to take derivatives (easy) rather than integrals (hard) to obtain moments. The correspondence also gives you a way of approximating the PDF from the first $n$ moments of the distribution: build a truncated series for the characteristic function from those moments, using the factors above, and take the inverse Fourier transform.
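As a quick sanity check of the derivative-to-moment formulas, here is a sketch (an assumed example, not from the original answer) using $X \sim \mathrm{Exponential}(1)$, whose characteristic function $\varphi(\omega) = 1/(1 - i\omega)$ is known in closed form and whose true moments are $E[X] = 1$, $E[X^2] = 2$:

```python
import numpy as np

# Read moments off the characteristic function by differentiating at
# omega = 0, via E[X^n] = (-i)^n * phi^{(n)}(0).
def phi(w):
    # characteristic function of Exponential(1)
    return 1.0 / (1.0 - 1j * w)

h = 1e-4
d1 = (phi(h) - phi(-h)) / (2 * h)             # central difference: phi'(0)
d2 = (phi(h) - 2 * phi(0) + phi(-h)) / h**2   # central difference: phi''(0)

EX = (-1j * d1).real   # -i * phi'(0)  -> E[X]   = 1
EX2 = (-d2).real       # -phi''(0)     -> E[X^2] = 2

assert abs(EX - 1.0) < 1e-6
assert abs(EX2 - 2.0) < 1e-5
```

Central differences are used only for convenience; differentiating $\varphi$ symbolically gives the same values exactly.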
