MATLAB: Why do I receive the "CUDA_ERROR_LAUNCH_TIMEOUT" error when trying to run GPU code in Parallel Computing Toolbox 5.2 (R2011b)?

Parallel Computing Toolbox

I am trying to run my computation on the GPU. When I execute my program I receive the following error message:
Warning: An unexpected error occurred during CUDA execution. The CUDA error
was: CUDA_ERROR_LAUNCH_TIMEOUT.
Error using arrayfun
The kernel execution failed because the CUDA driver timeout was encountered.

Best Answer

This is a limitation imposed on the Parallel Computing Toolbox by the underlying operating system.
This error occurs when a gpuArray operation or a CUDA kernel runs for a long time on a GPU that is used for both graphics rendering and CUDA computations. The error is triggered by the operating system, which limits how long the GPU can spend on a single computation before it must return to rendering the user desktop.
This limit does not apply to GPUs that do not have the desktop extended to them, such as standalone GPU accelerators or cards running under the TCC driver on Windows. Therefore, it is recommended that you run gpuArray and CUDA kernel evaluations on a GPU that is NOT attached to a display and does not have the Windows or Linux desktop extended onto it.
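As a rough sketch of how to look for such a device from MATLAB, the loop below checks each GPU's KernelExecutionTimeout property and stops at the first device that reports no timeout. It assumes that your release's GPUDevice object exposes this property. Note that gpuDevice(idx) selects each device as it inspects it.

```matlab
% Sketch: pick the first GPU that is not subject to the driver timeout.
% Assumption: the GPUDevice object exposes KernelExecutionTimeout.
selected = [];
for idx = 1:gpuDeviceCount
    dev = gpuDevice(idx);             % selects device idx and returns its properties
    if ~dev.KernelExecutionTimeout    % false means no timeout is enforced on this device
        selected = dev;
        break
    end
end

if isempty(selected)
    % No suitable device: the last device inspected remains selected.
    warning('All GPUs have a kernel execution timeout; long kernels may fail.');
else
    fprintf('Using GPU %d (%s)\n', selected.Index, selected.Name);
end
```

If no device qualifies, the remaining options described in this answer still apply.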
On the Windows Vista and Windows 7 operating systems, individual GPU kernels running on GPUs that are responsible for desktop rendering are limited to a 2-second runtime. On the Windows XP operating system, the runtime is limited to 5 seconds. Kernels that exceed this runtime limit trigger the Timeout Detection and Recovery (TDR) mechanism.
On the Linux operating system, the runtime limit is 5 seconds for kernels running on a GPU that is used for both desktop rendering and GPU computations.
If your system has GPUs without a display attached, you can manually modify the system configuration to disable the TDR mechanism. Disabling the TDR timeout allows kernels to run for extended periods without triggering an error. In addition, on Windows, Tesla cards running under the Tesla Compute Cluster (TCC) driver are not subject to this limitation.
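Before changing anything, it can help to see whether the TDR settings have already been modified. The sketch below reads the TdrLevel and TdrDelay values from the Windows registry with winqueryreg. It assumes the values live under the GraphicsDrivers key as documented by Microsoft, and it only reads them; it does not modify the configuration.

```matlab
% Sketch (Windows only): report the TDR registry values, if they have been set.
% Assumption: the values are stored under the GraphicsDrivers key below.
key = 'SYSTEM\CurrentControlSet\Control\GraphicsDrivers';
names = {'TdrLevel', 'TdrDelay'};
for k = 1:numel(names)
    try
        value = winqueryreg('HKEY_LOCAL_MACHINE', key, names{k});
        fprintf('%s = %d\n', names{k}, value);
    catch
        % winqueryreg raises an error when the value does not exist
        % (or when it is called on a non-Windows platform).
        fprintf('%s could not be read; it is probably not set, so the default applies.\n', names{k});
    end
end
```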
For information on TDR and how to modify or disable the timeout on Windows Vista and Windows 7 refer to:
Alternatively, you can segment your computation into several smaller computations, each of which finishes before the timeout is reached.
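A minimal sketch of this segmentation, assuming an element-wise computation run through arrayfun: the input is processed in fixed-size chunks so that each kernel launch is short. The function fcn, the input x, and the value of chunkSize are hypothetical placeholders; tune chunkSize so that a single call stays well under the limit on your system. If your release does not accept an anonymous function here, put the computation in a function file and pass a handle to it instead.

```matlab
% Hypothetical element-wise function and host data; substitute your own.
fcn = @(v) v .* v + 1;
x = rand(1e6, 1);

chunkSize = 1e5;                  % assumed value, not a recommendation
n = numel(x);
y = zeros(size(x));               % result is assembled on the host

for first = 1:chunkSize:n
    last = min(first + chunkSize - 1, n);
    xg = gpuArray(x(first:last)); % transfer one chunk to the GPU
    yg = arrayfun(fcn, xg);       % one short kernel launch per chunk
    y(first:last) = gather(yg);   % copy the chunk's result back to the host
end
```

Each iteration launches a separate kernel, so the runtime limit applies to one chunk at a time rather than to the whole computation. The extra transfers add overhead, which is the trade-off for staying under the limit.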
For more information, see: