[Math] Fourier transform: noise and variance

fourier-analysis, noise, signal-processing

I wrote a short program to generate $N$ samples of a sinusoid with some noise, i.e.

$$ f(t) = \cos(2\pi t) + 0.1 \cdot \text{noise}(t), $$

where $\text{noise}(t)$ is drawn uniformly from $[-1, 1]$.

Here is an example of $f(t)$ with 1000 samples:

*[Figure: sample plot of $f(t)$ with 1000 samples]*
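For concreteness, here is a minimal NumPy sketch of this setup (the helper name `noisy_cosine` is just for illustration, not the original program):

```python
import numpy as np

# Minimal sketch of the setup above: N samples of cos(2*pi*t) on t in [0, 1)
# plus uniform noise. The helper name `noisy_cosine` is illustrative.
def noisy_cosine(N, noise_amp=0.1, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(N) / N                        # t_n = n/N, one full period
    noise = rng.uniform(-1.0, 1.0, size=N)      # noise(t) ~ Uniform[-1, 1]
    return np.cos(2 * np.pi * t) + noise_amp * noise
```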

Then I found the amplitude of the fundamental using the Discrete Fourier Transform:

$$ 2F(1) = \frac{2}{N} \sum_{n=0}^{N-1} f\!\left(\frac{n}{N}\right) e^{-i 2\pi n/N} $$
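In code, this estimate could look like the following sketch, cross-checked against NumPy's unnormalized FFT (hence the extra factor $2/N$); the function name is again my own:

```python
# Sketch of the estimate: the explicit normalized sum at k = 1, cross-checked
# against NumPy's unnormalized np.fft.fft.
def fundamental_amplitude(f):
    N = len(f)
    n = np.arange(N)
    F1 = np.sum(f * np.exp(-2j * np.pi * n / N)) / N   # normalized DFT at k = 1
    return 2 * F1

f = noisy_cosine(1000)
assert np.isclose(fundamental_amplitude(f), 2 * np.fft.fft(f)[1] / len(f))
```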

I had my program randomly generate this function 100 times and computed the mean and standard deviation of the calculated fundamental amplitude. I repeated this process for sample counts ranging from 100 to 20000. Here is a plot of my results (the points are the means and the error bars are the standard deviations):

*[Figure: mean and standard deviation of the estimated fundamental amplitude vs. number of samples]*

The means in this plot are close to 1, as I would expect (the sinusoid has an amplitude of 1). However, the standard deviation decreases as $\frac{1}{\sqrt{N}}$ (I read this off a log-log plot).
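The experiment itself can be sketched along these lines (illustrative, reusing the helpers above):

```python
# Sketch of the experiment: 100 noisy realizations per N, then the mean and
# standard deviation of the estimated amplitude |2 F(1)|.
rng = np.random.default_rng(0)
for N in (100, 1000, 20000):
    amps = [abs(fundamental_amplitude(noisy_cosine(N, rng=rng)))
            for _ in range(100)]
    print(N, np.mean(amps), np.std(amps))   # std shrinks roughly like 1/sqrt(N)
```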

I can't figure out why the standard deviation changes with the number of samples! It intuitively makes sense that more samples = more precision, but I know that white noise has the same power at all frequencies.

Mathematically, why does the effect of noise decrease when I add more samples to the DFT?

Best Answer

Essentially, you are computing $\hat F_{k_1} = F_{k_1} + E_{k_1}$, where $F_k$ is the clean Fourier transform and $E_k$ is the Fourier transform of the noise term $e(n) = 0.1 \, \text{noise}(n/N)$.

By Parseval's theorem, $\sum_n |e(n)|^2 = N \sum_k |E_k|^2$. (Note that the usual statement of the theorem has the $N$ in the denominator; the difference arises because you are using a normalized Fourier transform instead of the standard definition.)

But $\sum_n |e(n)|^2 \approx N \sigma^2_e$ (assuming stationary iid noise) and $\sum_k |E_k|^2 \approx N |E_{k_1}|^2$ (assuming white noise, which has the same expected energy at all frequencies). Putting it all together, and assuming $N$ is large enough that we can replace the approximations by equalities, we get

$$ N \sigma^2_e = N^2 \, |E_{k_1}|^2 \implies |E_{k_1}| = \sqrt{\frac{\sigma^2_e}{N}} = \frac{\sigma_e}{\sqrt{N}}, $$

which is exactly the $1/\sqrt{N}$ decay you observed.
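A quick numerical check of this prediction (a sketch of mine, not the asker's code): for $e(n) = 0.1 \cdot \text{Uniform}(-1,1)$ noise, $\sigma_e = 0.1/\sqrt{3}$, and the RMS of $|E_{k_1}|$ over many realizations should match $\sigma_e/\sqrt{N}$.

```python
import numpy as np

# Numerical check of |E_{k1}| ~ sigma_e / sqrt(N) for e(n) = 0.1 * Uniform(-1, 1),
# whose standard deviation is sigma_e = 0.1 / sqrt(3).
rng = np.random.default_rng(1)
sigma_e = 0.1 / np.sqrt(3)
for N in (100, 1000, 10000):
    e = 0.1 * rng.uniform(-1.0, 1.0, size=(500, N))   # 500 noise realizations
    phase = np.exp(-2j * np.pi * np.arange(N) / N)    # k_1 = 1 twiddle factors
    E1 = (e * phase).mean(axis=1)                     # normalized DFT at k = 1
    print(N, np.sqrt(np.mean(np.abs(E1) ** 2)), sigma_e / np.sqrt(N))
```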

As @Chinny84 comments, this can be seen as a consequence of the CLT or, more elementarily, of the well-known fact that the variance of the sample mean decreases as $1/N$. Parseval's theorem roughly tells us that averaging in frequency is the same as averaging in time.
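Spelled out: the normalized DFT coefficient is itself a sample mean of $N$ zero-mean iid terms, so the usual variance-of-the-mean argument applies directly,

$$ E_{k_1} = \frac{1}{N} \sum_{n=0}^{N-1} e(n) \, e^{-i 2\pi k_1 n/N} \quad\implies\quad \mathbb{E}\,|E_{k_1}|^2 = \frac{1}{N^2} \sum_{n=0}^{N-1} \mathbb{E}\,|e(n)|^2 = \frac{\sigma^2_e}{N}, $$

since the cross terms vanish for zero-mean iid noise and the phase factors have unit modulus.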
