MATLAB: Why does gpuDevice take several minutes to complete?

gpu linux

I am running R2016b (I also tried R2015b) on CentOS 6.8 with CUDA Toolkit 8.0 installed, and I have tried driver versions 367.57 and 370.28. The machine has an NVIDIA Titan X (Pascal) GPU. I can use the GPU without problems through gpuArray etc., but the first call to gpuDevice in a session takes several minutes to complete. It does eventually return the correct answer. Subsequent calls complete immediately, until I restart MATLAB, start another MATLAB instance, or call gpuDevice([]); after any of those, the next call is slow again. The NVIDIA SDK examples all run fine and without delay. What could be the cause of this, and how do I fix it?
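For anyone who wants to reproduce the timing, here is a minimal sketch of how the delay can be measured. It assumes Parallel Computing Toolbox and a supported NVIDIA GPU; the variable names (d, tFirst, tSecond) are only illustrative.

```matlab
% Minimal timing sketch; assumes Parallel Computing Toolbox and an NVIDIA GPU.
gpuDevice([]);        % deselect the GPU so the next call starts from scratch

tic;
d = gpuDevice;        % first call after the reset (the slow one in my case)
tFirst = toc;

tic;
d = gpuDevice;        % second call: the device is already selected
tSecond = toc;

fprintf('%s (compute capability %s)\n', d.Name, d.ComputeCapability);
fprintf('first call: %.1f s, second call: %.3f s\n', tFirst, tSecond);
```

On a machine showing the problem, the first time should be minutes and the second close to zero.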
Thanks!