2-dimensional Fourier transform interpretation

fourier analysis, fourier transform

I have this image of a duck:

Taking the FFT of this image, I can generate two new images: a magnitude spectrum (right) and a phase spectrum (bottom left).

I am trying to reconstruct the image of the duck MANUALLY by adding together the 65,536 (256 × 256) sine waves encoded in this information. I am running into a problem where my duck is reconstructed but looks a little funny.

If the 2D Fourier transform is as follows:

$f(x,y) = \sum\sum F[h,k]e^{-2\pi i(hx+ky+\phi)}$

I was hoping to be schooled in my interpretation of this. My sine wave amplitude is $F[h,k]$. My phase is $\phi$. My indices for the wave (the number of times it 'cuts' each respective axis) are $h, k$.
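For comparison, here is the convention that `np.fft.ifft2` uses for an $N \times N$ image, written out as a sketch. Three things differ from the formula above: the sign of the exponent is positive for the inverse transform, there is a $1/N^2$ factor, and the phase $\phi[h,k]$ is a separate value for every wave that is added to the angle directly (it is not multiplied by $2\pi$). Here $|F[h,k]|$ and $\phi[h,k]$ are the magnitude and angle of the complex coefficient.

```latex
% Inverse 2D DFT in the NumPy convention (N x N image)
f[x,y] = \frac{1}{N^2}\sum_{h=0}^{N-1}\sum_{k=0}^{N-1} F[h,k]\, e^{+2\pi i (hx+ky)/N}

% Substituting F[h,k] = |F[h,k]| e^{i\phi[h,k]} and using that f is real,
% only the real part of each term survives the sum:
f[x,y] = \frac{1}{N^2}\sum_{h=0}^{N-1}\sum_{k=0}^{N-1}
         |F[h,k]| \cos\!\left(\frac{2\pi (hx+ky)}{N} + \phi[h,k]\right)
```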

So, what the magnitude spectrum is showing is:

The center is $h,k = 0,0$. The zero frequency component intensity represents the average intensity of the entire image.
Each coordinate, for example $(2, 2)$ from the center represents a wave with $(h, k) = (2, 2)$. The intensity of this pixel is an indication of the amplitude of the wave.
Similarly, in the phase spectrum, the values range over $(-\pi, \pi]$ and represent the offset of this sine wave from the origin.
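These three readings can be checked numerically. The sketch below assumes NumPy's `fft2` conventions and uses a small random array as a stand-in for the duck (the array and its size are made up for illustration). One caveat it shows: the zero-frequency coefficient is the sum of all pixels, so it equals the average only after dividing by the number of pixels, and `np.angle` reports phases in $(-\pi, \pi]$, so negative phases do occur.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 8))                 # hypothetical stand-in for the duck image

F = np.fft.fft2(img)
n_pixels = img.size

# The zero-frequency term is the SUM of all pixels, i.e. mean * n_pixels
dc_matches_mean = bool(np.isclose(F[0, 0].real / n_pixels, img.mean()))

# After fftshift, the zero-frequency term sits at index (n//2, n//2)
centered = bool(np.isclose(np.fft.fftshift(F)[4, 4], F[0, 0]))

# np.angle returns values in (-pi, pi], so some phases are negative
phases = np.angle(F)
has_negative_phase = bool((phases < 0).any())

print(dc_matches_mean, centered, has_negative_phase)
```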

The code I am using to do this is below. I SHOULD be able to perfectly reconstruct the image, but I keep getting a weird-looking duck. The following is the duck I get after adding together all 65,536 waves:

I was hoping somebody could point out my error. Thank you!

EDIT

If I only add the first 128 rows of my 2D FFT arrays, I get a duck that looks closer to the original. I feel like it has something to do with the fact that you only need 1/2 the information from these spectra, but I am adding together some of the wrong info. Like I need to cut each spectrum diagonally (about the symmetrical axis) and add only those waves…
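The "half the information" intuition is the conjugate symmetry of the FFT of a real image. A minimal sketch, assuming NumPy and a made-up random test array: each coefficient at $(h, k)$ has a mirror partner at $(-h, -k)$ that is its complex conjugate, and `np.fft.rfft2` stores only the non-redundant half. When summing real cosines, though, both partners still have to be included, because each pair contributes the same cosine twice.

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.random((6, 6))                 # any real-valued image; values are made up
F = np.fft.fft2(img)
n_rows, n_cols = F.shape

# For real input, F[-h, -k] == conj(F[h, k]) (indices taken modulo the size)
h, k = 2, 1
mirror = F[(-h) % n_rows, (-k) % n_cols]
symmetric = bool(np.isclose(mirror, np.conj(F[h, k])))

# rfft2 keeps only the non-redundant half along the last axis
half = np.fft.rfft2(img)
restored = np.fft.irfft2(half, s=img.shape)

print(symmetric, half.shape, np.allclose(restored, img))
```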

EDIT

Final product…

import numpy as np
import matplotlib.pyplot as plt
import cv2

#Wave generator
def waveGenerator(h, k, a, phi, res):
    mesh = np.fromfunction(lambda x, y: a*np.sin(2*np.pi*h*x/res + 2*np.pi*k*y/res + phi), shape = (res,res))
    return mesh

# Import some image...
duck = cv2.imread("data/fourier_duck.png", 0)

#Take the fft of the duck
duck_fft = np.fft.fft2(duck)
duck_fshift = np.fft.fftshift(duck_fft)
magnitude_duck = np.log(np.abs(duck_fshift) + 1)
unshifted = np.abs(duck_fft)
phases = np.angle(duck_fft)

#THE RECONSTRUCTED IMAGE (MANUALLY)
recon_img = np.full((256, 256), unshifted[0][0])

for h in range(len(unshifted)):
    for k in range(len(unshifted[h])):
        recon_img = np.add(recon_img, waveGenerator(h, k, unshifted[h][k], phases[h][k], 256))
        print(str(k) + ',' + str(h))

plt.imshow(recon_img)
plt.show()

Best Answer

Try changing:

a*np.sin

to

a*np.cos

in your waveGenerator function. When the phase is 0, the real part of the wave $e^{i\theta}$ should be at its maximum, which corresponds to a cosine: cos(0) = 1. The rest seems to be okay, other than scaling issues, I believe. I hope this helps.
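To see the difference between the two choices, here is a small sketch that is not taken from the question: the `reconstruct` helper and the random test image are assumptions made for illustration. It sums $|F| \cdot \mathrm{wave}(\text{angle} + \text{phase})$ over every $(h, k)$ and divides by $N^2$, once with `np.cos` and once with `np.sin`. The cosine sum is the real part of the inverse DFT and returns the image; the sine sum is the imaginary part, which cancels to zero for a real image.

```python
import numpy as np

def reconstruct(img, wave):
    """Sum |F| * wave(2*pi*(h*x + k*y)/n + phase) over all (h, k), scaled by 1/n^2."""
    n = img.shape[0]                      # assumes a square image
    F = np.fft.fft2(img)
    amp, phase = np.abs(F), np.angle(F)
    # x indexes rows and y indexes columns, matching np.fromfunction in the question
    x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    out = np.zeros((n, n))
    for h in range(n):
        for k in range(n):
            out += amp[h, k] * wave(2 * np.pi * (h * x + k * y) / n + phase[h, k])
    return out / n**2

rng = np.random.default_rng(2)
img = rng.random((8, 8))                  # hypothetical test image

from_cos = reconstruct(img, np.cos)
from_sin = reconstruct(img, np.sin)
print(np.allclose(from_cos, img), np.allclose(from_sin, 0.0))
```

The division by `n**2` is the scaling that the manual sum in the question leaves out; without it the picture looks the same under `plt.imshow`, but the pixel values are too large by that factor.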

Simplified test program below adapted from the question:

import numpy as np
import matplotlib.pyplot as plt
#import cv2

#Wave generator
def waveGenerator(h, k, a, phi, res):
    mesh = np.fromfunction(lambda x, y: a*np.cos(2*np.pi*h*x/res + 2*np.pi*k*y/res + phi), shape = (res,res))
    return mesh

# Import some image...
#duck = cv2.imread("data/fourier_duck.png", 0)
n_size = 64
duck = np.zeros((n_size,n_size))
duck[n_size//2][n_size//2]=1.0
duck[(n_size+10)//2][(n_size+5)//4]=2.0
plt.imshow(duck)
plt.show()

#Take the fft of the duck
duck_fft = np.fft.fft2(duck)
duck_fshift = np.fft.fftshift(duck_fft)
magnitude_duck = np.log(np.abs(duck_fshift) + 1)
unshifted = np.abs(duck_fft)
phases = np.angle(duck_fft)

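#THE RECONSTRUCTED IMAGE (MANUALLY)
#Start from zeros: the loop below already adds the (0, 0) term
recon_img = np.zeros((n_size, n_size))

for h in range(len(unshifted)):
    for k in range(len(unshifted[h])):
        recon_img = np.add(recon_img, waveGenerator(h, k, unshifted[h][k], phases[h][k], n_size))

#Apply the 1/N^2 factor of the inverse DFT so the values match the input
recon_img = recon_img / n_size**2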

plt.imshow(recon_img)
plt.show()

Related Solutions

[Math] Four-dimensional Fourier transform

I know it has been some time since you asked the question, but I was thinking about that the other day, too, which is why I would like to share my answer:

I was wondering, is there a definite mathematical reason for the sign difference between $\mathbf k \cdot \mathbf x$ and $\omega t$ in the exponent?

No, there is not. Suppose you want to solve a differential equation using Fourier Transforms. You get a mathematically valid solution for both cases, using the same relative sign or a different one. However, in a physics context, this is different, because it introduces a meaning to "time", which mathematics does not care about.

The mathematical solution for the $i(\mathbf k\cdot \mathbf x+\omega t)$ convention corresponds to a universe where time goes backwards compared to our physical universe:

If you look, for example, at an object moving with velocity $c$ in one dimension, its position $r$ is given by something like: \begin{equation}r = r_0+ct \end{equation} The physical picture at time $t=t_0$ is equivalent to the situation at $\tilde t_0=0$ for an object shifted in the opposite direction.

\begin{equation} r = r_0+ct_0 \iff r - ct_0 = r_0 \iff \tilde r = r_0 +c\tilde t_0 \end{equation} where $\tilde r := r-ct_0$.

In a way, you could say that in our universe time has the opposite direction to space. This is accounted for in the Minkowski metric, which is why the explanation with four-vectors works, too. However, as I tried to show with the example above, people knew this before special relativity and Lorentz transformations were a thing.
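A small numerical illustration of the sign choice, under assumptions of my own (the values of `k` and `omega` and the `crest_position` helper are made up, not part of the answer above): with opposite signs, as in $kx - \omega t$, a crest of the wave moves toward larger $x$ as $t$ increases, while with equal signs, $kx + \omega t$, it moves toward smaller $x$.

```python
import numpy as np

x = np.linspace(0.0, 10.0, 2001)
k, omega = 2.0, 3.0                      # illustrative values, not from the answer

def crest_position(sign, t):
    """Position of the crest inside 2 < x < 4.5 for cos(k*x + sign*omega*t)."""
    window = (x > 2.0) & (x < 4.5)
    wave = np.cos(k * x[window] + sign * omega * t)
    return x[window][np.argmax(wave)]

# Opposite signs, k*x - omega*t: the crest moves toward larger x as t grows
minus_before, minus_after = crest_position(-1, 0.0), crest_position(-1, 0.1)
# Same signs, k*x + omega*t: the crest moves toward smaller x
plus_before, plus_after = crest_position(+1, 0.0), crest_position(+1, 0.1)

print(minus_after - minus_before, plus_after - plus_before)
```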

I hope this was helpful.

Understand uncertainty principle from Fourier transform – Interpretation of Fourier transform on a restricted periodic function

I am answering my own question because I found the error. It was something in the code. I'm leaving the results here if anyone finds it useful or maybe wants to have a bit of fun with it:

import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import numpy as np

w = np.linspace(-10, 10, 1000)
w1 = 2 * np.pi
A = np.array([-10, -5, -2, -1, -0.5])
B = -A

domain1 = np.linspace(A[0], B[0], 1000)
domain2 = np.linspace(A[1], B[1], 1000)
domain3 = np.linspace(A[2], B[2], 1000)
domain4 = np.linspace(A[3], B[3], 1000)
domain5 = np.linspace(A[4], B[4], 1000)

func1 = np.sin(w1 * domain1)
func2 = np.sin(w1 * domain2)
func3 = np.sin(w1 * domain3)
func4 = np.sin(w1 * domain4)
func5 = np.sin(w1 * domain5)

im1 = (np.sin((w1 - w) * A[0]) - np.sin((w1 - w) * B[0])) / (2 * (w1 - w)) + (
            np.sin(-(w1 + w) * A[0]) - np.sin(-(w1 + w) * B[0])) / (2 * (w1 + w))
im2 = (np.sin((w1 - w) * A[1]) - np.sin((w1 - w) * B[1])) / (2 * (w1 - w)) + (
            np.sin(-(w1 + w) * A[1]) - np.sin(-(w1 + w) * B[1])) / (2 * (w1 + w))
im3 = (np.sin((w1 - w) * A[2]) - np.sin((w1 - w) * B[2])) / (2 * (w1 - w)) + (
            np.sin(-(w1 + w) * A[2]) - np.sin(-(w1 + w) * B[2])) / (2 * (w1 + w))
im4 = (np.sin((w1 - w) * A[3]) - np.sin((w1 - w) * B[3])) / (2 * (w1 - w)) + (
            np.sin(-(w1 + w) * A[3]) - np.sin(-(w1 + w) * B[3])) / (2 * (w1 + w))
im5 = (np.sin((w1 - w) * A[4]) - np.sin((w1 - w) * B[4])) / (2 * (w1 - w)) + (
            np.sin(-(w1 + w) * A[4]) - np.sin(-(w1 + w) * B[4])) / (2 * (w1 + w))

re1 = (np.cos((w1 - w) * A[0]) - np.cos((w1 - w) * B[0])) / (2 * (w1 - w)) + (
            np.cos(-(w1 + w) * A[0]) - np.cos(-(w1 + w) * B[0])) / (2 * (w1 + w))
re2 = (np.cos((w1 - w) * A[1]) - np.cos((w1 - w) * B[1])) / (2 * (w1 - w)) + (
            np.cos(-(w1 + w) * A[1]) - np.cos(-(w1 + w) * B[1])) / (2 * (w1 + w))
re3 = (np.cos((w1 - w) * A[2]) - np.cos((w1 - w) * B[2])) / (2 * (w1 - w)) + (
            np.cos(-(w1 + w) * A[2]) - np.cos(-(w1 + w) * B[2])) / (2 * (w1 + w))
re4 = (np.cos((w1 - w) * A[3]) - np.cos((w1 - w) * B[3])) / (2 * (w1 - w)) + (
            np.cos(-(w1 + w) * A[3]) - np.cos(-(w1 + w) * B[3])) / (2 * (w1 + w))
re5 = (np.cos((w1 - w) * A[4]) - np.cos((w1 - w) * B[4])) / (2 * (w1 - w)) + (
            np.cos(-(w1 + w) * A[4]) - np.cos(-(w1 + w) * B[4])) / (2 * (w1 + w))

mod1 = np.sqrt(im1 ** 2 + re1 ** 2)
mod2 = np.sqrt(im2 ** 2 + re2 ** 2)
mod3 = np.sqrt(im3 ** 2 + re3 ** 2)
mod4 = np.sqrt(im4 ** 2 + re4 ** 2)
mod5 = np.sqrt(im5 ** 2 + re5 ** 2)

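# re* is zero here (B = -A and cos is even), so im/re would divide by zero;
# arctan2 handles that case and also picks the correct quadrant
phase1 = np.arctan2(im1, re1)
phase2 = np.arctan2(im2, re2)
phase3 = np.arctan2(im3, re3)
phase4 = np.arctan2(im4, re4)
phase5 = np.arctan2(im5, re5)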

fig = plt.figure()
grid = gridspec.GridSpec(nrows=3, ncols=2, figure=fig)
ax1 = fig.add_subplot(grid[0,:])
ax2 = fig.add_subplot(grid[1,0])
ax3 = fig.add_subplot(grid[1,1])
ax4 = fig.add_subplot(grid[2,0])
ax5 = fig.add_subplot(grid[2,1])

ax1.plot(w, mod1, label=r'$x\in [-10;10]$')
ax1.set_title(r'$|X(j\omega)|$')
ax1.grid(True)
ax1.vlines(w1, 0, 1, linestyles="dashed")
ax1.vlines(-w1, 0, 1, linestyles="dashed")
ax1.set_xlabel(r'$\omega$')
ax1.legend()

ax2.plot(w, mod2, label=r'$x\in [-5;5]$')
ax2.set_title(r'$|X(j\omega)|$')
ax2.grid(True)
ax2.vlines(w1, 0, 1, linestyles="dashed")
ax2.vlines(-w1, 0, 1, linestyles="dashed")
ax2.set_xlabel(r'$\omega$')
ax2.legend()

ax3.plot(w, mod3, label=r'$x\in [-2;2]$')
ax3.set_title(r'$|X(j\omega)|$')
ax3.grid(True)
ax3.vlines(w1, 0, 1, linestyles="dashed")
ax3.vlines(-w1, 0, 1, linestyles="dashed")
ax3.set_xlabel(r'$\omega$')
ax3.legend()

ax4.plot(w, mod4, label=r'$x\in [-1;1]$')
ax4.set_title(r'$|X(j\omega)|$')
ax4.grid(True)
ax4.vlines(w1, 0, 1, linestyles="dashed")
ax4.vlines(-w1, 0, 1, linestyles="dashed")
ax4.set_xlabel(r'$\omega$')
ax4.legend()

ax5.plot(w, mod5, label=r'$x\in [-0.5;0.5]$')
ax5.set_title(r'$|X(j\omega)|$')
ax5.grid(True)
ax5.vlines(w1, 0, 1, linestyles="dashed")
ax5.vlines(-w1, 0, 1, linestyles="dashed")
ax5.set_xlabel(r'$\omega$')
ax5.legend()

plt.tight_layout()
plt.show()

All this work follows from the 3Blue1Brown video linking the Fourier transform and the quantum uncertainty principle. In the figure, the legend gives the time domain of $x(t)$. Notice that the wider the time domain, the sharper the maxima of the Fourier transform become. At one extreme, if we are certain of where a particle is, its wave function collapses to a Dirac delta (I think so, at least), so its Fourier transform is a constant: if we know exactly where a particle is, we have no idea what its frequency, and thus its energy, is, because every frequency is equally likely. At the other extreme, if we have no idea where a particle is, then its wave function spreads to infinity and the Fourier transform converges to two Dirac deltas (at the positive and negative frequencies), giving us the exact frequency of the particle! Therefore, the more certain we are of where a particle is, the narrower its time domain has to be and the wider its frequency domain becomes. We can never know exactly both where a particle is and what its energy is at the same time. The Fourier transform explains this concept beautifully.
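The trade-off can also be put into numbers. The sketch below is an illustration under my own assumptions, not part of the script above: it uses the closed form of the transform of $\sin(\omega_1 t)$ restricted to $[-T, T]$, and measures the full width at half maximum of the peak near $\omega_1$ (the `main_lobe_width` helper and the choice of half maximum as the width measure are mine). The printed product of $T$ and the width should stay roughly constant, so doubling the time window roughly halves the spectral width.

```python
import numpy as np

w1 = 2 * np.pi                           # frequency of sin(w1 * t), as in the script above

def main_lobe_width(T, n=20001):
    """Full width at half maximum of |X(w)| near w = w1 for sin(w1*t) on [-T, T]."""
    w = np.linspace(w1 - np.pi / T, w1 + np.pi / T, n)   # spans the main lobe
    # T*sinc(a*T/pi) equals sin(a*T)/a and stays finite at a = 0
    mag = T * np.abs(np.sinc((w1 - w) * T / np.pi) - np.sinc((w1 + w) * T / np.pi))
    above = mag >= mag.max() / 2
    return above.sum() * (w[1] - w[0])

half_lengths = [1.0, 2.0, 5.0, 10.0]     # T values: the signal lives on [-T, T]
lobe_widths = [main_lobe_width(T) for T in half_lengths]
for T, width in zip(half_lengths, lobe_widths):
    print(f"T = {T:5.1f}   spectral width = {width:.3f}   product = {T * width:.3f}")
```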
