Resetting GPU and driver after CUDA error

The same problem sometimes occurs on Unix, and Google sent me to this thread, so I hope this helps somebody else.

On Ubuntu, unloading and reloading the nvidia_uvm kernel module solved the problem for me:

sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm
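
The two commands above could be wrapped in a small helper, roughly like the sketch below. The function names and the DRY_RUN variable are hypothetical and only there for illustration. Note that rmmod refuses to unload a module that is still in use, so any process holding the GPU has to exit first.

```shell
#!/usr/bin/env bash
# Sketch only: reload the nvidia_uvm kernel module.
# DRY_RUN and both function names are hypothetical, not part of any NVIDIA tool.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unload the module, and load it again only if unloading succeeded.
reload_nvidia_uvm() {
    run sudo rmmod nvidia_uvm && run sudo modprobe nvidia_uvm
}

# Usage (prints the commands without touching the driver):
#   DRY_RUN=1 reload_nvidia_uvm
```

Running it with DRY_RUN=1 first is a cheap way to check what would be executed before touching the driver.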

To reset the graphics stack in Windows, press Win+Ctrl+Shift+B.


Edit:

If you are on Tesla hardware on Linux and can run nvidia-smi, then you can reset the GPU using

nvidia-smi -r

or

nvidia-smi --gpu-reset

Here is the man output for this switch:

Resets GPU state. Can be used to clear double bit ECC errors or recover hung GPU. Requires -i switch to target specific device. Available on Linux only.
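
Since the man text says the switch requires -i, a reset of one specific device might look like the sketch below. The function names and the DRY_RUN variable are hypothetical. The GPU index (0 in the usage line) is an assumption, so check the index that nvidia-smi lists for your device first.

```shell
#!/usr/bin/env bash
# Sketch only: reset a single GPU by index with nvidia-smi.
# DRY_RUN and both function names are hypothetical helpers for illustration.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# reset_gpu INDEX: reject anything that is not a non-negative integer,
# then pass the index to nvidia-smi through the -i switch.
reset_gpu() {
    local idx="$1"
    case "$idx" in
        ''|*[!0-9]*)
            echo "usage: reset_gpu <gpu-index>" >&2
            return 2
            ;;
    esac
    run sudo nvidia-smi --gpu-reset -i "$idx"
}

# Usage (prints the command without resetting anything):
#   DRY_RUN=1 reset_gpu 0
```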

Otherwise...


The way to truly reset the hardware is to reboot.

What you describe shouldn't happen. I recommend testing with different hardware and letting us know whether it still occurs.
