Fourier Analysis – Effects of Zero-Padding Compared to Time-Domain Interpolation


While studying the various Fast Fourier Transform implementations available online, I've come across a question about the way the DFT works in theory.

Suppose you have a sequence of $N$ points $x_0, …, x_{N-1}$. For $ k = 0, …, N-1 $, let $ X_k = \sum_{n=0}^{N-1} x_n e^{-2ik\pi \frac{n}{N}} $.

I've noticed that many algorithms are easier to implement, or faster, when the size of the input is a power of 2. To extend a signal of length $N$ to the next power of two $2^p \ge N$, I've seen two approaches.

  • Pad the signal with $0$s, setting $x_N, …, x_{2^p-1} = 0$, so that $X_k = \sum_{n=0}^{N-1} x_n e^{-2ik\pi \frac{n}{2^p}}$.
  • Interpolate the original values, setting $\tau = N/2^p$ as the new spacing between consecutive points and estimating the values at $0, \tau, 2\tau, …, (2^p-1)\tau$ by linear interpolation (a small numerical sketch of both options follows this list).
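
To make the two options concrete, here is a small NumPy sketch. This is my own illustration; the test signal, the lengths $N = 12$ and $2^p = 16$, and the variable names are arbitrary assumptions, not taken from any particular library.

```python
import numpy as np

N, P = 12, 16                           # original length N and padded length 2^p
t = np.arange(N)
x = np.sin(2 * np.pi * 2.3 * t / N)     # a tone that is not periodic in N samples

# Option 1: zero-pad, i.e. set x_N, ..., x_{2^p - 1} = 0, then take a 2^p-point DFT
x_zp = np.concatenate([x, np.zeros(P - N)])
X_zp = np.fft.fft(x_zp)

# Option 2: resample onto 2^p points spaced by tau = N / 2^p via linear interpolation
tau = N / P
t_new = np.arange(P) * tau              # np.interp holds the last value past t[-1]
x_li = np.interp(t_new, t, x)
X_li = np.fft.fft(x_li)

print(np.abs(X_zp[:5]).round(3))        # the two spectra differ noticeably
print(np.abs(X_li[:5]).round(3))
```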

I've heard people saying different things:

Some people oppose the first approach very strongly (I recently had a discussion with a physics teacher about this). They say that padding the signal with extra zeros gives you the Fourier coefficients of a different function, which will bear no relation to those of the original signal; interpolation, according to them, works great.

On the other hand, most libraries, if not all, that I have reviewed use the first solution.

All the references I could find on the internet were pretty vague on this topic. Some say that the best band-limited interpolation you can do in the frequency domain is obtained through zero-padding in the time domain, but I couldn't find a proof of that statement.

Could you please help me figure out the advantages and drawbacks of each approach? Ideally, I'm looking for something with a mathematical backing, not only visual examples =)

Thanks!

Best Answer

Zero-padding in the time domain corresponds to interpolation in the Fourier domain. It is frequently used in audio, for example for picking peaks in sinusoidal analysis.
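
To state that precisely in the notation of the question (this is the standard argument, writing $M = 2^p$ for the padded length), the zero-padded DFT is

$$X_k^{(M)} \;=\; \sum_{n=0}^{N-1} x_n\, e^{-2ik\pi \frac{n}{M}} \;=\; \hat{x}\!\left(\frac{2\pi k}{M}\right), \qquad \text{where } \hat{x}(\omega) = \sum_{n=0}^{N-1} x_n\, e^{-i\omega n}$$

is the discrete-time Fourier transform of the finite sequence. The original $N$-point DFT samples the very same function $\hat{x}$ at $\omega = 2\pi k/N$, so zero-padding does not produce the spectrum of an unrelated function: it samples the same continuous spectrum on a finer grid, i.e. it interpolates between the original bins.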

It doesn't increase the resolution, though, which really depends on the window shape and length. As mentioned by @svenkatr, taking the transform of a signal that's not periodic in the DFT size is like multiplying it by a rectangular window, which in turn is equivalent to convolving its spectrum with the transform of the rectangle function (a sinc). That sinc has high energy in its sidelobes (away from the center frequency), making the true sinusoidal peaks harder to find. This is known as spectral leakage.
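
For reference (a standard identity, not something specific to this answer), the transform of a length-$N$ rectangular window is the Dirichlet kernel

$$W(\omega) \;=\; \sum_{n=0}^{N-1} e^{-i\omega n} \;=\; e^{-i\omega (N-1)/2}\, \frac{\sin(\omega N/2)}{\sin(\omega/2)},$$

whose first sidelobe is only about $13\,\mathrm{dB}$ below the main lobe; convolving a line spectrum with this kernel is exactly the leakage described above.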

But I disagree with @svenkatr that the zero-padding is what causes the rectangular windowing; they are separate issues. If you multiply your non-periodic signal by a suitable window (like a Hann or Hamming window) whose length gives you the frequency resolution you need, and then zero-pad for interpolation in frequency, things work out just fine.
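
A minimal sketch of that recipe; the sample rate, tone frequency, and sizes below are assumptions chosen for the example, not taken from the answer:

```python
import numpy as np

fs, N, M = 1000.0, 256, 4096            # sample rate, window length, zero-padded FFT size
t = np.arange(N) / fs
x = np.sin(2 * np.pi * 123.4 * t)       # tone that falls between the N-point DFT bins

w = np.hanning(N)                       # Hann window: sets the leakage / resolution trade-off
X = np.fft.rfft(x * w, n=M)             # n=M zero-pads the windowed frame to M samples
freqs = np.fft.rfftfreq(M, d=1.0 / fs)  # bin spacing fs / M instead of fs / N

peak = freqs[np.argmax(np.abs(X))]
print(f"estimated peak: {peak:.2f} Hz") # lands close to 123.4 Hz
```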

By the way, zero-padding is not the only interpolation method that can be used. For example, when estimating the parameters of sinusoidal peaks (amplitude, phase, frequency) from the DFT, local quadratic interpolation (take the 3 points around a peak and fit a parabola) is often used, because it is far cheaper than zero-padding all the way to the frequency resolution you want (which would mean a much larger DFT size). A sketch is given below.
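
Here is a sketch of that local quadratic (parabolic) interpolation; the helper name and the test signal are illustrative assumptions:

```python
import numpy as np

def parabolic_peak(mag, k):
    """Refine the location of a spectral peak at bin k from 3 log-magnitude samples."""
    a, b, c = np.log(mag[k - 1]), np.log(mag[k]), np.log(mag[k + 1])
    delta = 0.5 * (a - c) / (a - 2 * b + c)   # fractional bin offset, |delta| <= 0.5
    return k + delta

fs, N = 1000.0, 256
t = np.arange(N) / fs
x = np.sin(2 * np.pi * 123.4 * t) * np.hanning(N)   # windowed tone between bins
mag = np.abs(np.fft.rfft(x))
k = int(np.argmax(mag))                             # coarse peak: nearest bin
print(f"refined peak: {parabolic_peak(mag, k) * fs / N:.2f} Hz")
```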
