So, I'm currently trying to optimize a function that makes heavy use of FFTs, and naturally the first thing I tried was moving most of the computation onto the GPU. However, I noticed that somewhere in the middle of the algorithm the memory allocated on the GPU is much higher than warranted by the gpuArrays I have in the workspace (none hidden in a subfunction, and no persistent gpuArrays either).
I should add that some of the arrays being processed by the FFT are fairly large (e.g. a 278-by-68-by-32-by-56 complex double array). The problem becomes apparent when executing this code:
% create large complex double array on the GPU (12GB card)
r = complex(gpuArray.randn([278, 68, 32, 56]), gpuArray.randn([278, 68, 32, 56]));
% this already allocates more than I would assume from (16 * numel(r))…?
% run an inverse FFT over dim 1
ri = ifft(r, [], 1);
% this then allocates a whopping additional 4GB of GPU memory — WTF?
% trying to free memory
clear ri;
% no GPU memory freed
clear r;
% about 1.2 GB freed
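For reference, this is how the allocation can be tracked at each step (a minimal sketch; AvailableMemory is a property of the gpuDevice object, and the wait call is there so pending kernels finish before the reading is taken):

```matlab
% track GPU memory use around each step of the example above
g = gpuDevice();
before = g.AvailableMemory;

r = complex(gpuArray.randn([278, 68, 32, 56]), gpuArray.randn([278, 68, 32, 56]));
wait(g);  % ensure asynchronous kernels have completed
fprintf('after randn: %.2f GB allocated\n', (before - g.AvailableMemory) / 1e9);

ri = ifft(r, [], 1);
wait(g);
fprintf('after ifft:  %.2f GB allocated\n', (before - g.AvailableMemory) / 1e9);

% for comparison: the raw payload is 16 bytes per complex double element,
% so each of r and ri should only account for 16 * numel(r) bytes
```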
When I began to debug this problem, I found that running this simple line
fft(complex(gpuArray.randn([4, 1]), gpuArray.randn([4, 1])));
frees almost all of the additionally (superfluously?) allocated memory. My strong hunch is that this is some internal buffer of the FFT library.
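If that dummy call turns out to be safe, it could be wrapped in a small helper so the intent is documented wherever it is used (a sketch; the function name is mine, and this relies on the observed behaviour above, not on any documented API):

```matlab
function flushFftBuffers()
%FLUSHFFTBUFFERS Run a tiny complex GPU FFT, which (as observed) coaxes
% the FFT library into releasing its internal work buffers. Undocumented
% behaviour: this may stop working in a future MATLAB release.
fft(complex(gpuArray.randn([4, 1]), gpuArray.randn([4, 1])));
end
```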
So, the question is: can I (or even should I) safely add this line to my code in strategic places to free up GPU memory for other, non-FFT-related functions? I'm asking in particular because I want to run two or three instances of MATLAB on the same machine (same GPU), but otherwise I always run into unexpected out-of-memory errors when calling the FFT library functions…
Thanks a lot in advance! /jochen
Best Answer