MATLAB: After rebooting a Linux system (Ubuntu 18.04), MATLAB (R2017a, R2019a) no longer finds the GPU, although the GPU and CUDA are still present

cuda, gpu not found, MATLAB, Parallel Computing Toolbox

Dear Sir or Madam,
After rebooting my Linux system today, MATLAB (R2017a, R2019a) no longer finds my GPU.
The error message is:
gpuDevice
Error using gpuDevice (line 26)
No supported GPU device was found on this computer. To learn more about
supported GPU devices, see www.mathworks.com/gpudevice
However, I have not changed anything in the hardware or software. I have also reinstalled CUDA, but it still does not work.
Is this a known issue?
How can I solve it?
I usually work on that machine remotely (via nx-server or a terminal). Neither variant works.
Many thanks in advance for any ideas.
With kind regards

Best Answer

Note that when you say "install CUDA", it's not clear whether you mean that you installed the CUDA Toolkit/SDK or the driver. MATLAB only needs the driver installed and running to use the GPU; the toolkit/SDK is not necessary. So I'll assume that something is wrong with your driver.
  • Run "nvidia-smi" in a command line and see if the GPU is reported correctly. It should look something like this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:04:00.0 Off |                    0 |
| N/A   37C    P0    57W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:05:00.0 Off |                    0 |
| N/A   43C    P0    70W / 149W |      0MiB / 11441MiB |     51%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
  • If nvidia-smi isn't working, the driver is not running at all. Try reinstalling the latest driver, and also check that your system detects the GPU. To see whether the system sees the GPU, use:
% lspci | grep -i nvidia
04:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
05:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
  • If the GPU shows up in lspci but nvidia-smi doesn't find anything, your driver is not working at all; reinstall it.
  • If the driver is working, check the permissions on your GPU device files; they should look something like the following. I've seen cases where the GPU is only accessible to root.
% ls -l /dev/nv*
crw-rw-rw- 1 root root 195, 0 Jan 16 16:17 /dev/nvidia0
crw-rw-rw- 1 root root 195, 1 Jan 16 16:17 /dev/nvidia1
crw-rw-rw- 1 root root 195, 255 Jan 16 16:17 /dev/nvidiactl
crw-rw-rw- 1 root root 243, 0 Jan 16 16:17 /dev/nvidia-uvm
crw-rw-rw- 1 root root 243, 1 Jan 16 16:17 /dev/nvidia-uvm-tools
  • If the permissions are wrong, on some driver versions you can change NVreg_DeviceFileMode=0660 to NVreg_DeviceFileMode=0666. The conf file to edit is in /etc/modprobe.d and generally has "nvidia" in its name, for example 50-nvidia.conf. Note that this setting may be deprecated; I haven't seen it in a while.
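The checks in the list above can be strung together in a small script. This is a minimal sketch rather than a definitive diagnostic: the function names perm_is_world_rw and diagnose_gpu are made up for illustration, and it assumes GNU stat (as shipped with Ubuntu) and that lspci and nvidia-smi are on the PATH when installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if an octal mode string (e.g. 666) grants
# both read and write access to "other" users. The last octal digit holds
# the permissions for "other"; 6 (rw-) and 7 (rwx) both qualify.
perm_is_world_rw() {
    case "$1" in
        *6|*7) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: runs the three checks from the list and prints
# one line per finding. It only reports; it does not change anything.
diagnose_gpu() {
    local dev mode found=0

    # 1. Does the PCI bus list an NVIDIA device at all?
    if command -v lspci >/dev/null 2>&1; then
        if lspci | grep -qi nvidia; then
            echo "lspci: NVIDIA device present"
        else
            echo "lspci: no NVIDIA device found"
        fi
    else
        echo "lspci: command not available"
    fi

    # 2. Is the driver answering?
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi: driver responding"
    else
        echo "nvidia-smi: not working (driver missing or not running)"
    fi

    # 3. Are the device files accessible to non-root users?
    for dev in /dev/nvidia*; do
        [ -c "$dev" ] || continue      # skip unmatched globs and non-devices
        found=1
        mode=$(stat -c '%a' "$dev")    # GNU stat: octal permission bits
        if perm_is_world_rw "$mode"; then
            echo "$dev: mode $mode looks fine"
        else
            echo "$dev: mode $mode, may be inaccessible to non-root users"
        fi
    done
    [ "$found" -eq 1 ] || echo "no /dev/nvidia* device files found"
}
```

Calling diagnose_gpu from the same shell prints one line per check, which makes it easier to see at which step things go wrong.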
Check that the nvidia module is loaded and that "nouveau" is not.
% lsmod | grep nvidia
nvidia_uvm 917504 0
nvidia_drm 45056 2
nvidia_modeset 1110016 1 nvidia_drm
nvidia 19894272 2 nvidia_modeset,nvidia_uvm
drm_kms_helper 155648 2 mgag200,nvidia_drm
drm 360448 10 mgag200,ttm,nvidia_drm,drm_kms_helper
ipmi_msghandler 49152 2 nvidia,ipmi_si
% lsmod | grep nouveau
(there should be no output)
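Both lsmod checks can be combined into one pass. The following is a sketch under the assumption that only the first column of the lsmod output (the module name) matters; driver_state is a hypothetical name, not an existing tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `lsmod` output on stdin and reports which of
# the two drivers is loaded. Only exact matches in the first column count,
# so nvidia_uvm or nvidia_drm alone do not count as "nvidia".
driver_state() {
    awk '
        $1 == "nvidia"  { nvidia = 1 }
        $1 == "nouveau" { nouveau = 1 }
        END {
            if (nvidia && nouveau) print "conflict: nvidia and nouveau both loaded"
            else if (nvidia)       print "ok: nvidia loaded, nouveau absent"
            else if (nouveau)      print "problem: nouveau loaded instead of nvidia"
            else                   print "problem: nvidia module not loaded"
        }'
}
```

It would be used as `lsmod | driver_state`.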
If nouveau is loaded, you need to blacklist it by adding a file in /etc/modprobe.d containing something like the lines below. The driver installation process usually does this (I have a file called "nouveau-blacklist.conf"), but that file might have been removed for some reason. You may not need the lbm-nouveau line, depending on what you are running.
blacklist nouveau
blacklist lbm-nouveau
A reboot is required for the blacklisting to take effect.
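Creating the blacklist file could be sketched as follows. The function name write_nouveau_blacklist is hypothetical, the default path is an assumption based on the file name mentioned above, and only the "blacklist nouveau" line is written because the second line is described as optional.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a minimal nouveau blacklist to the given path.
# The default target is an assumption; pass another path to try it out
# without touching /etc/modprobe.d (which needs root privileges).
write_nouveau_blacklist() {
    local target=${1:-/etc/modprobe.d/nouveau-blacklist.conf}
    printf '%s\n' 'blacklist nouveau' > "$target"
}
```

For example, `write_nouveau_blacklist /tmp/nouveau-blacklist.conf` writes a copy you can inspect first. On Ubuntu it is commonly also necessary to run `sudo update-initramfs -u` before rebooting, so that the blacklist applies during early boot; treat that as an assumption to verify on your system.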