[Math] Help understanding the output from Apache Commons Math’s Discrete Fourier Transform

fourier analysis, transformation

I'm using a discrete Fourier transform to translate a finite set of samples to the frequency domain. I'm trying to start with a very simple set, but am still getting confused.
I'm starting with this set of samples:

$$\{ 1,0,1,0,1,0,1,0\}$$

I expected to get output indicating that the frequency domain representation of this set was expressed by
1*cos(t)
But instead I get:
Raw output:

$$\{ 4.0, 0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0\}$$

My (possibly flawed) interpretation of the output:

Frequency representation=

$$ 4.0\cos(2\pi/1)+$$
$$ 0.0\cos(2\pi/2)+$$
$$ 0.0\cos(2\pi/3)+$$
$$ 0.0\cos(2\pi/4)+$$
$$ 4.0\cos(2\pi/5)+$$
$$ 0.0\cos(2\pi/6)+$$
$$ 0.0\cos(2\pi/7)+$$
$$ 0.0\cos(2\pi/8)$$

Why are the coefficients $4$?

Why, when my source data has a very simple repeating pattern, do I have more than one entry in the frequency domain representation?

If my interpretation of the output is correct, how should I interpret the resulting function? Here's a graph.

I'm using an Apache math library's implementation of the DFT (Commons math 3-3.3).

Best Answer

Your input is not just a cosine. The cosine function has zero mean, but your samples have mean $1/2$. The input is accurately represented by $$ \frac12 + \frac12 \cos 4 t,\quad t=0,\pi/4,\pi/2,3\pi/4,\dots,7\pi/4 $$ This means: coefficient $1/2$ at frequency $0$ and coefficient $1/2$ at frequency $4$.
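A quick numerical check of this representation, as a minimal Python sketch (the variable names are illustrative and not taken from the question):

```python
import math

# Eight sample points t = 0, pi/4, ..., 7*pi/4 covering one period.
N = 8
t = [2 * math.pi * j / N for j in range(N)]

# Evaluate 1/2 + 1/2*cos(4t) at each sample point.
samples = [0.5 + 0.5 * math.cos(4 * tj) for tj in t]

# Up to floating-point noise these alternate between 1 and 0,
# which is the original input sequence.
print([round(s, 9) for s in samples])
```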

The output lists frequencies beginning with $0$. Also, the normalization of the transform in this routine happens to be such that instead of the Fourier coefficients themselves you get $N$ times them, where $N$ is the length of your vector. Dividing by $8$ gives the vector of coefficients.

Generally, the Fourier transform involves complex exponential functions $e^{ikt}=\cos kt+i\sin kt$. You don't see the sines here because the input data is "even".

The implementation you are using gives the same result as Matlab's fft command, so I refer there for the complete list of formulas for the direct and inverse transform implemented by this function.
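The factor of $N$ and the vanishing sine terms can be checked with a naive DFT. This is only a sketch in plain Python, assuming the same unnormalized convention as Matlab's fft (forward transform with $e^{-2\pi i jk/N}$ and no scaling); it does not call the Apache Commons Math routine, and `dft` is a helper written here for illustration.

```python
import cmath

def dft(x):
    """Unnormalized forward DFT: X[k] = sum_j x[j] * exp(-2*pi*i*j*k/N)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

x = [1, 0, 1, 0, 1, 0, 1, 0]
X = dft(x)

# Dividing by N turns the raw output into the Fourier coefficients.
coeffs = [Xk / len(x) for Xk in X]

print([round(Xk.real, 9) for Xk in X])      # raw output, real parts
print([round(Xk.imag, 9) for Xk in X])      # imaginary parts (all ~0 here)
print([round(c.real, 9) for c in coeffs])   # coefficients after dividing by N
```

Up to rounding, the raw output has $4$ in positions $0$ and $4$, and the normalized coefficients are $1/2$ at those positions.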

Suppose I had an input dataset that gave me both real and imaginary values in my coefficient vector. How can I usefully factor the imaginary values into a result? What do the imaginary values effectively 'mean'?

Let's take a more generic example (I rounded the Matlab output below):

fft([1 0 2 3 0 4 -1 0])  = [9, -4-2i, -i, 6+4i, -5, 6-4i, i, -4+2i]

The easiest way to see how the output $X$ represents the input $x$ is to look up the inverse command (ifft in Matlab), because this is the thing that recovers the original vector. Its formula is

$$x(j) = \frac1N \sum_{k=1}^N X(k) e^{2\pi i(j-1)(k-1)/N} \tag{1}$$

(which is unfortunately mistyped on the Matlab page). Although both $X$ and the complex exponentials involve complex numbers, the result of summation will be real if your input $x$ is real. Therefore, we can simplify the matter by keeping only the real part of (1). Since $\operatorname{Re}\big[(a+ib)(\cos\theta+i\sin\theta)\big]=a\cos\theta-b\sin\theta$, this gives

$$x(j) = \frac1N \sum_{k=1}^N (\operatorname{Re}X(k))\cos( 2\pi (j-1)(k-1)/N) \\- \frac1N \sum_{k=1}^N (\operatorname{Im}X(k))\sin( 2\pi (j-1)(k-1)/N) \tag{2}$$

With specific numbers ($N=8$): $$x(j) = \frac18 (9-4\cos (2\pi (j-1)/8)+6\cos (2\pi (j-1)(4-1)/8) - 5\cos (2\pi (j-1)(5-1)/8) + \dots) - \frac18 ( -2 \sin(2\pi (j-1)/8)-\sin(2\pi (j-1)(3-1)/8) +4\sin(2\pi (j-1)(4-1)/8) + \dots) $$ This should recover the input [1 0 2 3 0 4 -1 0] for $j=1,\dots,8$, but I was too lazy to type in all the terms. The formulas are easier to write if the arrays use $0$-based indexing, but Matlab has $1$-based indices.
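The reconstruction can be checked numerically. The sketch below uses plain Python with $0$-based indices and assumes the forward transform uses $e^{-2\pi i jk/N}$, as Matlab's fft does; the inverse then carries the opposite sign in the exponent, so its real part is $\operatorname{Re}X\cos\theta-\operatorname{Im}X\sin\theta$. The helpers `dft` and `real_inverse` are hypothetical names written for this illustration, and the exact (unrounded) transform values are used rather than the rounded ones quoted above.

```python
import cmath
import math

def dft(x):
    """Unnormalized forward DFT: X[k] = sum_j x[j] * exp(-2*pi*i*j*k/N)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def real_inverse(X):
    """Real part of the inverse DFT, written with cosines and sines only."""
    n = len(X)
    out = []
    for j in range(n):
        total = 0.0
        for k in range(n):
            angle = 2 * math.pi * j * k / n
            # Re[(a + ib)(cos + i sin)] = a*cos - b*sin
            total += X[k].real * math.cos(angle) - X[k].imag * math.sin(angle)
        out.append(total / n)
    return out

x = [1, 0, 2, 3, 0, 4, -1, 0]
X = dft(x)
recovered = real_inverse(X)

print([round(v, 9) for v in recovered])
```

For real input the coefficients come in conjugate pairs, $X(N-k)=\overline{X(k)}$ with $0$-based $k$, which is why the imaginary parts carry the sine (odd) content of the signal and cancel out of the final sum.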

Related Solutions

[Math] Space-time Fourier explainer

Angular frequency is defined the way it is because $\sin$ and $\cos$ have simple derivatives when their argument is in radians, and have period $2\pi$ in radians. So converting an ordinary frequency into one that $\sin$, $\cos$, and $\operatorname{e}^{ix}$ understand requires multiplying by $2\pi$.

$\mathbf{k}$ is a vector known as the wave vector (its magnitude is the wavenumber). Its components are related to the wavelength in different directions in the same way that angular frequency is related to period.

The space-time Fourier transform is just four Fourier transforms, one for each dimension. Traditionally the sign convention is chosen so that a wave with angular frequency $\omega$ propagates in the direction $\mathbf{k}$ points. That means that: $$\begin{align} \tilde{f}(\omega,\mathbf{k}) &\propto \int f(t, \mathbf{x}) \operatorname{e}^{i\omega t - i \mathbf{k}\cdot\mathbf{x}} \operatorname{d} t \operatorname{d}^3x \\ f(t, \mathbf{x}) & \propto \int \tilde{f}(\omega,\mathbf{k}) \operatorname{e}^{-i\omega t + i\mathbf{k}\cdot \mathbf{x}} \operatorname{d}\omega \operatorname{d}^3k, \end{align}$$ where the constants of proportionality are chosen by convention, and different people use different conventions. I, personally, prefer the symmetric/unitary convention.

When you work in cylindrical and/or spherical coordinate systems you complicate matters a bit. Working in cylindrical coordinates moves you from straight Euclidean Fourier transforms into the realm of Hankel transforms in the radial direction and a discrete Fourier series in the azimuthal angle. In the case of spherical coordinates, the radial transform becomes a spherical Bessel function version of the Hankel transform and the angular coordinates become an expansion in spherical harmonics. These are all examples from linear algebra of writing a function in a particular orthonormal basis.

[Math] Fourier Curve Fitting

I'm getting a lot closer...

I found this video, but it conveniently cut off right before the useful part. It's as if the math gods are laughing at me.

I then found this excel sheet

So it appears the basic approach to determining the Fourier coefficients from sampled data is to sum up the individual terms over one cycle. Then averageish the sum to get the coefficient.

What I don't fully understand in the spreadsheet is why they are summing every other sample (i.e. cell D7 ---> =C6+4*C7+C8) and dividing by 3/180 (see the cells in row 3).

I'm also not clear on what X would be in time sampled data. In the spreadsheet it is an angle. In sampled data, would it just be the percentage of the total sample time scaled to 2PI (i.e. the x value of the middle sample would be PI... the sample 25% of the way through the time period would be PI/2... 75% would be 3PI/2, etc.)?

It's missing from the spreadsheet, but if you do a column with this formula:

=$D$3/2+$E$3*COS(B6)+$F$3*COS(2*B6)+$G$3*COS(3*B6)+$H$3*COS(4*B6)+$I$3*COS(5*B6)+$J$3*COS(6*B6)+$L$3*SIN(B6)+$M$3*SIN(2*B6)+$N$3*SIN(3*B6)+$O$3*SIN(4*B6)+$P$3*SIN(5*B6)+$Q$3*SIN(6*B6)

And then plot that column and column C, the function above fits the measured data. You could probably drop off the coefficients that are really small, but that is data dependent.
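The "sum over one cycle, then average" idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the spreadsheet's method: it uses a plain equal-weight sum rather than the weighting in the spreadsheet cells, it assumes the samples are equally spaced over exactly one cycle, and it maps sample $j$ of $N$ to the angle $x_j = 2\pi j/N$. The function names and the test signal are made up for the example.

```python
import math

def fourier_coefficients(y, n_harmonics):
    """Estimate a[0..K] and b[1..K] from N equally spaced samples of one cycle."""
    n = len(y)
    # Map sample index to an angle: fraction of the cycle scaled to 2*pi.
    x = [2 * math.pi * j / n for j in range(n)]
    a = [
        2.0 / n * sum(yj * math.cos(k * xj) for xj, yj in zip(x, y))
        for k in range(n_harmonics + 1)
    ]
    b = [0.0] + [
        2.0 / n * sum(yj * math.sin(k * xj) for xj, yj in zip(x, y))
        for k in range(1, n_harmonics + 1)
    ]
    return a, b

def evaluate(a, b, xv):
    """Evaluate a[0]/2 + sum_k (a[k]*cos(k*x) + b[k]*sin(k*x)) at angle xv."""
    return a[0] / 2 + sum(
        a[k] * math.cos(k * xv) + b[k] * math.sin(k * xv)
        for k in range(1, len(a))
    )

# Made-up test signal: 3 + 2*cos(x) - 1.5*sin(3x), sampled 36 times per cycle.
N = 36
xs = [2 * math.pi * j / N for j in range(N)]
ys = [3 + 2 * math.cos(xv) - 1.5 * math.sin(3 * xv) for xv in xs]

a, b = fourier_coefficients(ys, 6)
fit = [evaluate(a, b, xv) for xv in xs]

print([round(v, 6) for v in a])
print([round(v, 6) for v in b])
```

The constant term enters as `a[0] / 2`, matching the `$D$3/2` at the start of the spreadsheet formula above.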
