MATLAB: Asynchronous GPU calculations

cuda, gpu

In the latest release, GPU calls execute asynchronously with respect to the CPU. But do GPU calls also run asynchronously with respect to each other? That is, do multiple calls made with e.g. feval(CUDA_kernel, …) execute at the same time, or does the GPU wait for each call to finish before starting the next? Does one need to put wait() between feval calls to guarantee that they execute in order on the GPU? Experiments indicate that wait() is not needed, but it would be nice to have a proper guarantee.

Best Answer

The asynchronous nature of the kernel invocations should be completely transparent in terms of functionality; if it isn't, that is a bug. wait() is required only when you want to time portions of code.
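As a rough illustration of both points, here is a minimal sketch. It assumes Parallel Computing Toolbox and a supported GPU, and it uses built-in gpuArray operations as a stand-in for the feval(CUDA_kernel, …) calls in the question, since no kernel file is given there. The matrix size and variable names are illustrative only.

```matlab
% Sketch only: requires Parallel Computing Toolbox and a supported GPU.
dev = gpuDevice;                 % handle to the currently selected GPU
A = rand(2048, 'gpuArray');      % 2048-by-2048 matrix created on the GPU

% Dependent GPU operations with no wait() in between:
% each result is consumed by the next call.
B = A * A;
C = B + 1;
result = gather(C);              % gather returns only once C is available

% Compare against the same computation done on the CPU.
Ah = gather(A);
expected = Ah * Ah + 1;
maxRelErr = max(abs(result(:) - expected(:))) / max(abs(expected(:)));
fprintf('max relative error vs CPU: %g\n', maxRelErr);

% Timing: without wait(), toc may fire before the GPU has finished.
tic; D = A * A; tLaunch = toc;                 % may measure little more than the launch
tic; E = A * A; wait(dev); tComplete = toc;    % measures launch plus execution
fprintf('without wait: %g s, with wait: %g s\n', tLaunch, tComplete);
```

If the results match to within floating-point tolerance, the dependent calls behaved as if executed in order without any explicit wait(). For benchmarking, gputimeit is an alternative that handles the synchronization for you.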