MATLAB: Question on periodic part extraction in a simple periodic picture with zeroing in FFT

fftfourier filteringImage Processing Toolboxperiodic picture

hey folks, i have a question on FFT filtering that i am working on. My goal is to separate out the periodic part and aperiodic part of a image. This idea comes from Michal Haindl and Martin Hatka's paper'Near-Regular Texture Synthesis'. And it is shown below in the FFT filter part.
I understand that the FFT spectrum of a image, and the low freq. part is most likely to be the periodic part. Since for noise, it is most likely to be in high freq.
For the above image. even though it is not perfectly periodic, but if we do the zeroing and then do the ifft, we should be able to have it doing the job for more or less, this is what I am thinking for now.
What I have done for so far it to fft2 the image and in order to get the spectrum, I had fftshift, and F=abs(F), F=log(F+1). And then I think I have three methods that I can try in order to do the zeroing: 1st, set a threshold and have everything below it to be zeroed. 2nd, manually zeroing some of the spikes(which i know PhotoShop can do in quick), and 3rd, applying some filter on it(which i will try in future if the 1st method can give some promising result).
I am trying the first one to firstly see how would the picture look like after some zeroing, and if it works out, I will go for the filtering. But the problem I am having is:
after doing all the fftshift(F) and abs(F) and F=log(F+1), the F that we have is not capable to do the IFFT. is it the right way to go back if I have F=exp(F)-1; img3=abs(fftshift(ifft2(F))); I get confused cuz what's coming out does not seem to be right direction. I know this will not work, but it should at least give back something that make sense I thought.
Also I was thinking is there any particular filters that you guys would think that is likely to work here.
Please let me know am I on the right track, any help would be very much appreciated

Best Answer

Well you got some concepts right and some wrong. Strong periodic patterns in the spatial image will give rise to "spikes" in the Fourier domain (spectral) image. If you filter out those spikes and inverse transform, in theory you should be able to get rid of the periodic pattern. See attached demo.
A typical images has a spectrum that is high in the middle (origin) and falls away as you move away from the origin towards higher spatial frequencies. These spikes ride on that spectrum like spikes on a mountain side. So it's often/usually not possible to threshold the spectrum and find a mask for the spikes because a given threshold will get the whole top of the mountain and miss the spikes sticking up from the lower slopes. You need to make the spectral image flatter so that you can use a threshold. For example, you can use a tophat filter or contrast limited adaptive histogram equalization (adapthisteq()). This can produce a fairly flat image with no mountain and just the spikes, which you can then threshold to produce the spike mask. You multiply the spike mask by the Fourier trnaform image to eliminate the spatial frequencies that gave rise to the periodic pattern, while leaving other spatial frequencies (that are responsible for the bulk of the image) intact. Now you don't multiply the mask by the log or the spectrum, and you don't exponentiate the spectrum after masking. You just take the log if you want to compress the dynamic range for easy viewing, but you don't do any math or filtering on it because inverse filtering that won't give you back the original image. You need to mask and inverse transform the original spectrum - that means both the real part and the imaginary part both need to be masked, and then you inverse FFT the whole complex thing.
Now recall from Fourier theory that big things and things spaced widely in the spatial domain give rise to energy near the middle of the spectrum, and small things and fine details are represented by energy out near the outer edges of the spectrum - that's where the high frequencies live. But it is incorrect to say "I understand that the FFT spectrum of a image, and the low freq. part is most likely to be the periodic part." That's not a valid generalization at all. Where the spikes lie depend on the spacing of the pattern. Widely spaced patterns will have narrowly spaced spike patterns in the Fourier domain, and vice versa. But each spike is a mini-spectrum of the periodic "unit" such as a circle or square.
Let's take an example - an array of squares like your paper's image. So the image is essentially
spatialImage = bigRect * (smallRect ** 2Dcomb)
where * is regular multiplication and is convolution.
What the heck is that you ask? A 2D comb function is just a rectangular array of dirac delta functions. Now the Fourier transform of a comb function is another comb function with inverse spacing. So smallRect is just one little square all by itself - a rectangle function in 2 dimensions. When we convolve one small rect square with a comb function, we get a whole array of little rectangles all over the place in a rectangular grid. In essence one small rect at the location of every delta function. OK, that's fine and understandable. So what would be get if we fft'ed that? Now you know that the FFT of a single rect is just a 2D sinc function. But since the FT of a comb is a comb, and convolution in the spatial domain is multiplication in the Fourier domain, FT(smallRect**2Dcomb) = 2DSinc * 2dcomb. So what you would get is a comb function (an array of pure spikes) that is modulated by a big wide envelope that has a sinc shape. So you see an array of spikes with light spikes and darker spikes in a rectangular bar pattern (where the sinc goes to zero). That is what would happen for an infinitely large pattern.
But you don't have an infinitely large pattern, do you? You have a finite image. Mathematically what is that? It's a big 2D rect function times your infinite image. Remember the infinite image was (smallRect 2Dcomb) but when we crop that to a finite size, we are essentially multiplying it by a big 2D rect function (bigger than the small rects that make up the array of small boxes). But the 2DRect is multiplied by the other infinite image so that means that it's convolved with the image in the Fourier domain. Recall that multiplication in one domain is convolution in the other domain, and vice versa? But the image rect is big so the FT of is is a small sinc function - smaller than the sinc function of a little box because sizes are inversely related. So what effect does this have? Well because we are cropping the pattern with a box, we get a small sinc convolved with the other pattern, which recall was a comb array modulated by a really big sinc. This makes a little tiny sinc function at the location of every spike in the Fourier domain. So what you see is an array of small 2D sinc functions with an overall widely spaced sinc modulating the intensity of the array of little sincs. So mathematically
FT(bigRect * (smallRect ** 2Dcomb)) = FT(bigRect) * FT(smallRect ** 2Dcomb)
due to linearity - the FT is a linear transform so that FT(A*B) = F(A) * F(B). So next,
= (small sinc) ** (comb modulated by large sinc)
= array of small sincs with broad modulation by a much larger sinc.
I hope that explains it some and that, after reading it 10 times, you can understand it.
For the pattern at the bottom of your image I'm pretty sure you would not be able to Fourier filter it to get a solid white background unless you just kept the single pixel at the center of the FT - the patterns are just too large relative to the image.
I'm not familiar with that paper you mentioned but it seems like it's aimed at producing an effect similar to the famous "image quilting" technique presented at Siggraph and given in this File Exchange submission http://www.mathworks.com/matlabcentral/fileexchange/35828--siggraph2002--image-quilting-texture-synthesize If you liked my explanation and it helped in your understanding, then maybe you could vote for it. Feel free to ask any clarifying questions.
Related Question