[Math] Form of the spectral density in Wiener Khinchin theorem

fourier-analysis, probability, stationary-processes, stochastic-processes

The Wiener–Khinchin theorem says that the autocorrelation function of a wide-sense stationary process can be written as a Stieltjes integral, where the integrator function is called the power spectral distribution function. When the power spectral distribution function is absolutely continuous, its derivative is called the power spectral density, and in that case the power spectral density and the autocorrelation function are a Fourier transform pair.

My question is when and how the power spectral density of a stationary stochastic process can be represented as on Wikipedia:
for a stationary process $x(t), t \geq 0$,

the power spectral density can be defined as
$$
S_{xx}(\omega) := \lim_{T \rightarrow \infty} \mathbf{E} \left[ \left| \frac{1}{\sqrt{T}} \int_0^T x(t) e^{-i\omega t}\, dt \right|^2 \right].
$$

Thanks and regards!

Best Answer

The Wikipedia article is still under construction, and still contains errors. I am one of the editors.

The formula you cite is from the section of the article about the power spectral density of a stochastic process, and is rather sloppy; it still needs to be corrected. But the verbal definition you cite is from a different paragraph of the article, and applies first of all to an individual signal, i.e., a deterministic signal, a sample function of the process, considered in isolation from all the other sample functions and thus ignoring the structure of the process. Secondly, it applies to a process too, but only to the spectral decomposition of the process, not to the formula you mention.

Now, the truth is this: given any (deterministic) function of time $x(t)$ such that $$\lim_{T\rightarrow\infty} {1\over 2T} \int_{-T}^T x(t+\tau)x(t)\, dt \tag{$*$}$$ exists for all $\tau$, one can find a statistical distribution function $S$, called the power spectral distribution function of $x$, such that for almost all frequencies $f_1,f_2$, the difference $S(f_2)-S(f_1)$ is the amount of power contributed to $x$ by frequencies in the band $[f_1,f_2]$, in the sense of the sum of the squares of the jumps in that band of $s$, the generalised Fourier transform of $x$. Here $s$ is defined as the limit in mean (i.e., the limit in an $L^2$ space, not a pointwise limit) of $$s(\omega) = \int_{-A}^{-1} x(t) {e^{-i\omega t}\over it}\, dt + \int_{-1}^{1} x(t) {e^{-i\omega t}-1\over it}\, dt + \int_{1}^{A} x(t) {e^{-i\omega t}\over it}\, dt \tag{$**$}$$ as $A$ goes to infinity, with $\omega = 2\pi f$.
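As a quick numerical illustration (a sketch of my own, not part of the argument): the time average $(*)$ can be checked for the simple deterministic signal $x(t)=\cos t$, for which the limit is $\frac12\cos\tau$.

```python
import numpy as np

# Numerical check that the time-average autocorrelation (*) exists for
# x(t) = cos(t): (1/2T) * integral_{-T}^{T} cos(t+tau) cos(t) dt
# tends to cos(tau)/2 as T -> infinity.
def time_avg_autocorr(x, T, tau, n=200_000):
    t = np.linspace(-T, T, n, endpoint=False)
    dt = 2 * T / n
    return np.sum(x(t + tau) * x(t)) * dt / (2 * T)

for T in (10.0, 100.0, 1000.0):
    print(T, time_avg_autocorr(np.cos, T, tau=1.0))  # tends to cos(1)/2
```

For signals without such a limit (e.g., $x(t)=t$), the average diverges, and the theory above does not apply.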

The first tricky bit is that $x$ will not usually have a Fourier transform, which is why we have to put a factor of $t$ in the denominator here, for convergence. If only $x$ had a Fourier transform $X$, this generalisation, $s$, would be the integral of $X$.

The second tricky bit is that even if $s$ is continuous, it might be so far away from being differentiable that its "infinitesimal" jumps contribute something to the power. For this reason, the intuitive notion of "the sum of the squares of the Fourier coefficients of $x$" has to be interpreted as "the sum of the squares of the jumps of $s$", which, in turn, has to be interpreted as $$\lim_{\epsilon\rightarrow0} {1 \over 2\epsilon} \int_{f_1}^{f_2} \vert s(u+\epsilon) - s(u-\epsilon) \vert ^2\, du.$$ This succeeds in defining $S$ almost everywhere.

Now even if $S$ is not differentiable, it does define a distribution, and its derivative in the sense of a distribution can be defined as the power spectral density. But since $S$ can have jump discontinuities, its derivative can have delta functions in it.
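A standard example, added here for concreteness, is the pure tone $x(t) = \cos \omega_0 t$ with $\omega_0 = 2\pi f_0$: the limit $(*)$ equals $\frac12\cos\omega_0\tau$, the power spectral distribution function $S$ is a step function with jumps of $\frac14$ at $f = \pm f_0$, and the power spectral density exists only as a distribution,
$$
{dS \over df} = \tfrac14\,\delta(f - f_0) + \tfrac14\,\delta(f + f_0),
$$
a "line spectrum" consisting of delta functions and nothing else.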

==The case of a stochastic process==

Suppose now that $X(t)$ is a stochastic process. We must further assume that it is stationary (in the wide sense); this assumption is analogous to assumption $(*)$ above for a deterministic signal. Then $X$ has a spectral decomposition, which is a rather sophisticated analogue of the Fourier transform of a deterministic function. It uses a notion of stochastic integration which is much more elementary than Itô's notion of a stochastic integral; see Gnedenko, Kurs teorii veroyatnostei (A Course in Probability Theory), Chapter 10, no. 56, p. 316. Provided the integrals are understood in the sense of the limit in mean of stochastic processes, one can write a spectral decomposition of $X$ entirely analogous to $(**)$:

$$X(t) = \int_0^\infty \cos \omega t \, dZ_1(\omega) + \int_0^\infty \sin \omega t \, dZ_2(\omega),$$ where $$Z_1(\omega) = \lim_{T\rightarrow\infty} {1\over 2\pi} \int_{-T}^T X(t) {\sin \omega t\over t}\, dt$$

and $$Z_2(\omega) = \lim_{T\rightarrow\infty} {1\over 2\pi} \int_{-T}^T X(t) {1-\cos \omega t\over t}\, dt.$$

Now here, too, if the process is ergodic, so that time averages can be replaced by ensemble averages taken with the expectation operator $\mathbf{E}$, then the average power (variance) contributed by the frequency $f$ can be found by looking at the expected value of the square of the jump of $Z_i$ at $f$, i.e., by studying $\mathbf{E}\left( \vert Z_i(\omega + \Delta\omega) - Z_i(\omega)\vert ^2 \right)$ and so on. But at this point one bails and uses a theorem of Bochner, as generalised by Khinchin to the context of stochastic processes, and sees that this is equal to $F(\omega + \Delta\omega) - F(\omega)$, where $F$ is the statistical distribution function given by Bochner's theorem applied to the auto-correlation function of the process $X$.
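In symbols (this is the standard statement of the theorem, not the article's wording): Bochner's theorem in Khinchin's form represents the autocorrelation function $R(\tau) = \mathbf{E}[X(t+\tau)X(t)]$ of a wide-sense stationary process as
$$
R(\tau) = \int_{-\infty}^{\infty} e^{i\omega\tau}\, dF(\omega),
$$
with $F$ non-decreasing and bounded, and the increments of $F$ measure the average power contributed by each frequency band.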

==Now, as to the formula itself==

The formula you quote, $$S_{xx}(\omega) := \lim_{T \rightarrow \infty} \mathbf{E} \left[ \left| \frac{1}{\sqrt{T}} \int_0^T x(t) e^{-i\omega t}\, dt \right|^2 \right],$$

is not correct. I have never seen a reliable source that proves (or even asserts) that the limit exists. I see the formula a lot on the internet and in engineering textbooks, but they never bother to address its convergence. I computed an example for a line spectrum, and it does not converge; admittedly, it should not converge, since a line spectrum does not have a power spectral density.
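To make the line-spectrum computation concrete (a sketch of my own, using the deterministic tone $x(t)=\cos\omega_0 t$, so the expectation is trivial): at $\omega = \omega_0$ the quantity $\frac{1}{T}\left|\int_0^T x(t)\,e^{-i\omega t}\, dt\right|^2$ grows like $T/4$, so the limit cannot exist.

```python
import numpy as np

# For x(t) = cos(w0 t), the integral of x(t) e^{-i w0 t} over [0, T]
# equals T/2 + O(1), so (1/T)|...|^2 grows like T/4 instead of converging.
def scaled_periodogram(omega, T, n=400_000):
    t = np.linspace(0.0, T, n, endpoint=False)
    dt = T / n
    val = np.sum(np.cos(omega * t) * np.exp(-1j * omega * t)) * dt
    return np.abs(val) ** 2 / T

for T in (100.0, 400.0, 1600.0):
    print(T, scaled_periodogram(1.0, T))  # roughly T/4: no limit as T grows
```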

Consider the right-hand side without the expectation operator, as if for a deterministic signal. Then it does not converge even when the spectral density exists, since the sample paths of a noisy process have unbounded variation on any finite interval whatsoever. As far as I know, one must introduce a lag window factor to make it converge, i.e., something like Cesàro summation, but for an integral instead of a series.
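As an illustration of why some smoothing is needed (a discrete-time sketch of my own, with Bartlett-style segment averaging standing in for a lag window): for Gaussian white noise, the raw periodogram ordinate fluctuates as much as its mean no matter how long the record, while averaging over segments does reduce the fluctuations.

```python
import numpy as np

rng = np.random.default_rng(0)

def raw_ordinate(n, k=5):
    # Periodogram ordinate |X_k|^2 / n of a fresh white-noise record;
    # for an interior frequency k this is roughly Exp(1), whatever n is.
    x = rng.standard_normal(n)
    return np.abs(np.fft.rfft(x)[k]) ** 2 / n

def bartlett_ordinate(n, segments, k=5):
    # Average over independent segments: the spread shrinks like 1/sqrt(segments).
    return np.mean([raw_ordinate(n, k) for _ in range(segments)])

raw = [raw_ordinate(4096) for _ in range(2000)]
avg = [bartlett_ordinate(256, 16) for _ in range(2000)]
print(np.std(raw))  # stays near 1 however long the record
print(np.std(avg))  # about 1/4: averaging (a crude window) tames it
```

The raw ordinate is not a consistent estimator: taking $n$ larger refines the frequency grid but does nothing to the variance at each frequency, which is exactly the non-convergence complained about above.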

This topic is fraught with peril: a signal contaminated with noise is modelled by a function which is continuous but nowhere differentiable, with unbounded variation on any finite interval, so Fourier inversion is never valid. More generally, because of the nature of these signals, one can never be sure it is valid to interchange two limits.

One often hears hand-waving assertions to the effect that Laurent Schwartz's method of distributions makes these formulas all right. But even with distributions, one still has to convolve with a lag window or a spectral window to obtain convergence. I have never seen proofs of these hand-waving assertions, and the only careful statement of the relevant theorems I know, D. C. Champeney, A Handbook of Fourier Theorems, Cambridge Univ. Press (without proofs, but it is after all a handbook, which omits the proofs), does not treat stochastic processes.