MATLAB: Do I need 'wait' commands to assess GPU performance in the Parallel Computing Toolbox 6.0 (R2012a)?

Parallel Computing Toolbox

The following script was used to compare performance of GPU vs CPU:
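dev = gpuDevice;            % handle to the selected GPU
wait(dev);                  % make sure the GPU is idle before timing
Narr = 1024:1024:5*1024;    % matrix sizes to test
for i = 1:numel(Narr)
    N = Narr(i);
    A = rand(N,N);
    % Time the FFT on the CPU
    tic;
    x1 = fft(A);
    cpuTime = toc;
    % Copy the data to the GPU and wait for the transfer to finish
    Ag = gpuArray(A);
    wait(dev);
    % Time the FFT on the GPU; wait(dev) blocks until the GPU is done
    tic;
    x2 = fft(Ag);
    wait(dev);
    gpuTime1 = toc;
    fprintf('Size = %d, speedup = %f\n', N, cpuTime/gpuTime1);
end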
Why is the 'wait' function needed when timing GPU code in R2012a, when it was not needed in MATLAB 7.13 (R2011b) and earlier releases?

Best Answer

In MATLAB 7.14 (R2012a), GPU operations run asynchronously with respect to the CPU, which means that MATLAB continues executing while the GPU is still working. Thus, when assessing GPU performance, we need to include 'wait' commands to ensure that the GPU has completed its work before 'toc' is called. Without waiting on the GPU, the tic/toc results are not meaningful, because the timer can stop before the GPU computation has finished.
In previous releases (R2011b and earlier), MATLAB and the GPU were synchronous, so any call to the GPU had to complete before MATLAB proceeded to the next command.
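
To make the effect concrete, here is a minimal sketch that times the same GPU FFT twice, once without and once with 'wait'. It uses only the functions from the question (gpuDevice, gpuArray, fft, wait, tic/toc). The matrix size and the variable names tNoWait and tWait are assumptions chosen for illustration, and the numbers you see will depend on your hardware.

```matlab
% Sketch: time the same GPU FFT with and without wait(dev).
dev = gpuDevice;            % handle to the currently selected GPU
N   = 4096;                 % illustrative matrix size (assumption)
Ag  = gpuArray(rand(N, N)); % transfer test data to the GPU
wait(dev);                  % make sure the transfer has finished

x = fft(Ag);                % warm-up call so one-time setup cost is excluded
wait(dev);

% Timing without synchronization: toc may run while the GPU is still busy.
tic;
x = fft(Ag);
tNoWait = toc;
wait(dev);                  % let the GPU finish before the next measurement

% Timing with synchronization: toc runs only after the GPU has finished.
tic;
x = fft(Ag);
wait(dev);
tWait = toc;

fprintf('without wait: %g s, with wait: %g s\n', tNoWait, tWait);
```

On R2012a and later, the first figure would be expected to be noticeably smaller than the second, since it mostly measures the time needed to queue the work rather than the time needed to complete it.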
This change is also described in the R2012a release notes.