FFT sampling: why is Nyquist frequency enough for inverse transform

fourier transform, sampling-theory

I am trying to understand an aspect of the Fast Fourier Transform (FFT) that is not making sense to me, but I'm struggling to find the flaw in my reasoning.

Suppose I sample a continuous signal at a sampling rate $f$. According to Nyquist, the maximum frequency I can "recover" from this signal via an FFT is $f/2$. How come, when I apply the inverse transform, I recover the original samples in real space at the sampling rate $f$, and not a "worsened" version of them that only contains frequencies up to $f/2$?

In other words: the intuition I am getting is that when applying an FFT, one loses half the frequency information. However, this cannot be right, otherwise one could not invert the transform.

Could someone help me see where the flaw is?

Best Answer

You start with a signal having frequencies in the interval $[0,f]$. You take the FFT and get a spectrum with frequencies in $[-f/2,f/2]$. Then you apply the inverse FFT and recover the signal with frequencies in the interval $[0,f]$. Why do you think half the information vanishes in the middle step?

You might ask, how can signal components varying as fast as $f$ be represented in frequencies from $[-f/2, f/2]$? The answer is the wagon wheel effect and aliasing: all the spectrum originally in the interval $[f/2,f]$ is aliased into the $[-f/2,0]$ components. Then the inverse transform recovers the aliases back to their original frequencies.

This is not, however, magic. Time-domain frequencies in the interval $[f, 2f]$ also alias into the spectral interval $[-f/2,f/2]$ and your inverse transform will reconstruct them as if they were in the interval $[0,f]$. But it is exceptionally important to realize -- the discrete-time samples for aliased frequencies are the same. There is nothing in the time domain signal that distinguishes aliased frequencies. So the infinite family of aliases are all sent to one spectral bin and then reconstructed as if they had the smaller (positive) frequency (although the reconstructed sample values are the same as if the frequency was correctly recovered).

To convince yourself of some of this, actually sample signals of sinusoids having frequencies $-f/2$, $3f/2$, and $5f/2$. Then look at other sequences of aliases to firmly convince yourself that the sampled signals are identical. So there is no way for the inverse transform to know which member of an alias family to reconstruct -- and it doesn't have to. In the sample, there is no difference among the aliased frequencies.
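Here is a minimal sketch of that experiment in plain Python (standard library only). The sampling rate of 8 Hz, the sample count, and the second alias family around 3 Hz are arbitrary choices made for illustration. Cosines are used because a sine at exactly $\pm f/2$ samples to all zeros, which would make the comparison trivial.

```python
import math

def sample_cosine(freq, rate, count):
    """Return `count` samples of cos(2*pi*freq*t) taken at times t = k / rate."""
    return [math.cos(2 * math.pi * freq * k / rate) for k in range(count)]

rate = 8.0    # sampling rate f; an arbitrary illustrative value
count = 32    # number of samples to compare

# The family named in the text: -f/2, 3f/2 and 5f/2 differ by multiples of f.
nyquist_family = [-rate / 2, 3 * rate / 2, 5 * rate / 2]
# A second, made-up family built the same way around 3 Hz.
other_family = [3.0 + m * rate for m in (-1, 0, 1, 2)]

def max_spread(freqs):
    """Largest sample-by-sample difference between any member and the first."""
    reference = sample_cosine(freqs[0], rate, count)
    return max(
        abs(a - b)
        for freq in freqs[1:]
        for a, b in zip(sample_cosine(freq, rate, count), reference)
    )

nyquist_spread = max_spread(nyquist_family)
other_spread = max_spread(other_family)
print(nyquist_spread, other_spread)  # both sit at floating-point rounding level
```

Both spreads come out at rounding-error size, which is the numerical version of "the sampled signals are identical".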

So what is this all about? Suppose the original source was actually a pair of sinusoids at aliased frequencies. Apply the FFT and the inverse FFT and the result is at the lowest positive alias of both -- it is no longer a sum of two sinusoids, it's just one.
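A small sketch of that last point, again in plain Python. The naive $O(n^2)$ `dft` helper below stands in for a real FFT, and the numbers (16 samples at 16 Hz, components at 3 Hz and its alias 19 Hz) are assumptions picked so that bin $b$ corresponds to $b$ Hz.

```python
import cmath
import math

def dft(values, sign=-1):
    """Naive O(n^2) DFT; sign=-1 is the forward transform, sign=+1 the unscaled inverse."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(sign * 2j * math.pi * b * k / n) for k in range(n))
        for b in range(n)
    ]

def idft(spectrum):
    """Inverse of `dft`, including the 1/n scaling."""
    n = len(spectrum)
    return [value / n for value in dft(spectrum, sign=+1)]

n = 16
rate = 16.0   # with n equal to the rate, bin b corresponds to b Hz

# Two sinusoids whose frequencies differ by exactly the sampling rate.
samples = [
    math.cos(2 * math.pi * 3 * k / rate) + math.cos(2 * math.pi * 19 * k / rate)
    for k in range(n)
]

spectrum = dft(samples)
occupied_bins = [b for b, value in enumerate(spectrum) if abs(value) > 1e-6]
print(occupied_bins)  # [3, 13]: one component at 3 Hz plus its mirror at -3 Hz

# The round trip still returns the original samples.
roundtrip_error = max(abs(a - b) for a, b in zip(idft(spectrum), samples))
print(roundtrip_error)
```

The spectrum shows a single cosine of doubled amplitude rather than two separate components, yet the inverse transform reproduces every sample. Nothing was lost that the samples contained in the first place.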

Related Solutions

[Math] How to interpret the results of a Discrete Fourier Transform

The discrete Fourier transform implicitly regards the input data as periodic. The length of the input data is an integer multiple of the period of each frequency that the output data correspond to. If you generate the input data from a sinusoid whose period doesn't evenly divide the length of the data, then you're effectively creating a jump at the point where the first data point follows the last in the periodic continuation. A jump has appreciable Fourier components at all frequencies, and this is superimposed on your signal.
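To see the effect of that jump, here is a rough sketch comparing a sine that fits a whole number of periods into the data with one that does not. The length of 64 samples, the period counts 5 and 5.5, and the 1% threshold are all illustrative assumptions, and the naive DFT is only a stand-in for an FFT.

```python
import cmath
import math

def dft_magnitudes(values):
    """Magnitudes of a naive O(n^2) forward DFT."""
    n = len(values)
    return [
        abs(sum(values[k] * cmath.exp(-2j * math.pi * b * k / n) for k in range(n)))
        for b in range(n)
    ]

n = 64  # illustrative data length

def count_significant_bins(periods, threshold=0.01):
    """Count bins whose magnitude exceeds `threshold` times the peak magnitude."""
    samples = [math.sin(2 * math.pi * periods * k / n) for k in range(n)]
    magnitudes = dft_magnitudes(samples)
    peak = max(magnitudes)
    return sum(1 for m in magnitudes if m > threshold * peak)

whole = count_significant_bins(5.0)        # exactly 5 periods fit: no jump
fractional = count_significant_bins(5.5)   # 5.5 periods: jump at the wrap-around

print(whole)       # 2 (the bin for +5 cycles and its mirror for -5 cycles)
print(fractional)  # many more bins carry energy because of the jump
```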

Nevertheless, you do get a signal with a clear maximum around the input frequency. I tried this using this online FFT calculator. In your case, as you calculated, we'd expect the maximum to be around the $21$st and $22$nd output data points (with the first data point corresponding to frequency $0$), and indeed, the first $40$ output values (real part, imaginary part) are:

30.322000,0.000000
30.391603,-0.332305
30.619103,-0.635705
30.983132,-0.983397
31.541176,-1.314554
32.258933,-1.685996
33.213385,-2.076468
34.383753,-2.509669
35.850294,-2.989665
37.679693,-3.538320
39.950279,-4.172808
42.781442,-4.897551
46.400390,-5.801462
51.088228,-6.910079
57.343876,-8.338977
66.017176,-10.282208
78.721064,-13.088380
98.974578,-17.487260
136.068697,-25.448952
225.318492,-44.421047
727.024048,-150.774758
-543.414483,118.240910
-191.999756,43.727082
-114.594042,27.275221
-80.661116,20.008091
-61.661208,15.918785
-49.536603,13.289953
-41.141647,11.459233
-35.002598,10.087933
-30.321248,9.041060
-26.630074,8.216618
-23.664950,7.534937
-21.230810,6.967688
-19.199079,6.488785
-17.492157,6.083581
-16.004951,5.723743
-14.733225,5.416510
-13.616822,5.138707
-12.640339,4.905611
-11.768659,4.665358

So it seems you must have been doing something wrong when you got spikes all over the place at roughly the same amplitude.

You can actually calculate the output in closed form in this case by writing the sinusoid in terms of complex exponentials and summing the resulting partial geometric series.

Edit:

Here's that last suggestion written out.

You have the time-domain signal $\sin2\pi\nu t$ with $\nu=440\text{Hz}$, and you sample it at the frequency $f=44100\text{Hz}$, so you have the samples

$$ s_k=\sin\frac{2\pi\nu k}f=\frac1{2\mathrm i}\left(\exp\left(\frac{2\pi\mathrm i\nu k}f\right)-\exp\left(-\frac{2\pi\mathrm i\nu k}f\right)\right) $$

for $0\le k\lt n$, with $n=2048$. Performing a discrete Fourier transform on these samples yields

\begin{eqnarray*} \tilde s_j &=& \sum_{k=0}^{n-1}s_k\exp\left(\frac{2\pi\mathrm i jk}n\right) \\ &=& \sum_{k=0}^{n-1}\frac1{2\mathrm i}\left(\exp\left(\frac{2\pi\mathrm i\nu k}f\right)-\exp\left(-\frac{2\pi\mathrm i\nu k}f\right)\right)\exp\left(\frac{2\pi\mathrm i jk}n\right) \\ &=& \frac1{2\mathrm i}\sum_{k=0}^{n-1}\left(\exp\left(2\pi\mathrm i\left(\frac jn+\frac\nu f\right)k\right)-\exp\left(2\pi\mathrm i\left(\frac jn-\frac\nu f\right)k\right)\right) \\ &=& \frac1{2\mathrm i}\sum_{k=0}^{n-1}\left(\omega_{j+}^k-\omega_{j-}^k\right) \\ &=& \frac1{2\mathrm i}\left(\frac{1-\omega_{j+}^n}{1-\omega_{j+}}-\frac{1-\omega_{j-}^n}{1-\omega_{j-}}\right)\end{eqnarray*}

with $\omega_{j\pm}=\exp\left(2\pi\mathrm i\left(\frac jn\pm\frac\nu f\right)\right)$. The second term dominates and is maximal at $\frac jn=\frac\nu f$ (which in your case is at the fractional value $j=\frac{n\nu}f\approx20.43$). It's a phase factor times

$$ \frac{\sin n\pi\left(\frac jn-\frac\nu f\right)}{\sin\pi\left(\frac jn-\frac\nu f\right)}\;, $$

which is a discrete periodic analogue of the $\operatorname{sinc}$ function. Fitting to this functional form might improve your estimate of the frequency from the output data of the Fourier transform.
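As a sanity check on the derivation, the following sketch evaluates the closed form for $\nu=440\text{Hz}$, $f=44100\text{Hz}$, $n=2048$ and compares it with the defining sum for a few bins near the peak. It uses the same sign convention as the formulas above (positive exponent), which differs from what many FFT libraries use. The function names are made up for this example.

```python
import cmath
import math

nu, f, n = 440.0, 44100.0, 2048

def closed_form(b):
    """Closed-form value of the transform at bin b, from the geometric series."""
    def geometric_sum(shift):
        omega = cmath.exp(2j * math.pi * (b / n + shift))
        return (1 - omega ** n) / (1 - omega)
    # omega_{j+} uses +nu/f, omega_{j-} uses -nu/f, as in the derivation.
    return (geometric_sum(+nu / f) - geometric_sum(-nu / f)) / 2j

def direct_sum(b):
    """The defining sum over all n samples, with the positive-exponent convention."""
    return sum(
        math.sin(2 * math.pi * nu * k / f) * cmath.exp(2j * math.pi * b * k / n)
        for k in range(n)
    )

# Largest disagreement between the two for a few bins around the expected peak.
max_mismatch = max(abs(closed_form(b) - direct_sum(b)) for b in (19, 20, 21, 22))

# Search only the first half; the second half mirrors it for real input.
peak_bin = max(range(n // 2), key=lambda b: abs(closed_form(b)))

print(max_mismatch)           # rounding-level difference
print(peak_bin)               # 20, i.e. the 21st output value
print(peak_bin * f / n)       # coarse frequency estimate, about 430.7 Hz
```

The peak lands on bin 20 (zero-based), next to the fractional location $n\nu/f\approx20.43$, which matches the listed output where the 21st and 22nd values are the largest.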

Nyquist-Shannon, aliasing and Fourier downsampling

The answer is in the first post you linked to:

To downsample, it transforms to the frequency domain and deletes the second and third groups of $N/4$ elements (which correspond to the half with the highest frequency components).

So apparently you confused the middle half of the elements as stored in memory with the middle half of the frequencies.
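A short sketch may make the storage-order point concrete. It assumes the usual DFT layout, in which indices above $N/2$ hold the negative frequencies, and it uses a made-up 16-sample test signal containing only 1 and 3 cycles per record. The naive `dft` helper stands in for an FFT.

```python
import cmath
import math

def dft(values, sign=-1):
    """Naive O(n^2) DFT; sign=+1 gives the unscaled inverse."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(sign * 2j * math.pi * b * k / n) for k in range(n))
        for b in range(n)
    ]

N = 16
signal = [
    math.cos(2 * math.pi * 1 * k / N) + 0.5 * math.sin(2 * math.pi * 3 * k / N)
    for k in range(N)
]

# Signed frequency held at each storage index: 0..7, then -8..-1.
signed_freq = [b if b < N // 2 else b - N for b in range(N)]
deleted = list(range(N // 4, 3 * N // 4))              # second and third quarters
kept = [b for b in range(N) if b not in deleted]
print([signed_freq[b] for b in deleted])  # [4, 5, 6, 7, -8, -7, -6, -5]
print([signed_freq[b] for b in kept])     # [0, 1, 2, 3, -4, -3, -2, -1]

# Downsample by 2: drop the middle of the array, invert at half the length.
spectrum = dft(signal)
half_spectrum = [spectrum[b] for b in kept]
half_n = len(half_spectrum)
# Dividing by N (not N/2) compensates for the shorter inverse transform.
downsampled = [value.real / N for value in dft(half_spectrum, sign=+1)]

# For this band-limited signal the result equals every second original sample.
max_error = max(abs(downsampled[m] - signal[2 * m]) for m in range(half_n))
print(max_error)
```

The elements in the middle of the array are the ones with the largest absolute frequency, so deleting them removes the high-frequency half, not the mid-range frequencies.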
