NVIDIA NVML Driver/library version mismatch
Surprise surprise, rebooting solved the issue (I thought I had already tried that).
The solution Robert Crovella mentioned in the comments may also be useful to someone else, since it's pretty similar to what I did to solve the issue the first time I had it.
As etal said, rebooting can solve this problem, but I think a procedure that avoids rebooting is also helpful.
For a Chinese version of this answer, check my blog.
The error message
NVML: Driver/library version mismatch
tells us that the loaded NVIDIA kernel module (kmod) has a different version from the user-space driver libraries, so we should unload that module and then load the correct version of the kmod.
How can we do that?
First, we should know which drivers are loaded.
lsmod | grep nvidia
You may get something like:
nvidia_uvm 634880 8
nvidia_drm 53248 0
nvidia_modeset 790528 1 nvidia_drm
nvidia 12312576 86 nvidia_modeset,nvidia_uvm
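Before unloading anything, it can help to confirm that the loaded module really is older or newer than the one installed on disk. The sketch below is one way to compare them. The function names are hypothetical helpers, and it assumes that /proc/driver/nvidia/version contains a line starting with NVRM that carries a dotted version number, and that modinfo can find the nvidia module for the running kernel.

```shell
# Pull the first dotted version number (for example major.minor.patch)
# out of the text read on stdin.
first_version() {
    grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Version of the kernel module that is currently loaded.
# Assumption: the NVRM line in this file carries the version.
loaded_version() {
    grep NVRM /proc/driver/nvidia/version | first_version
}

# Version of the nvidia module installed on disk for the running kernel.
installed_version() {
    modinfo -F version nvidia
}
```

Running loaded_version and installed_version on the affected machine and comparing the two strings shows whether a stale module is still loaded.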
Our final goal is to unload the nvidia module, so we should first unload the modules that depend on it:
sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset
sudo rmmod nvidia_uvm
Then unload nvidia itself:
sudo rmmod nvidia
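The set of loaded NVIDIA modules differs between machines, so the rmmod steps above can be sketched as a small helper that prints only the modules that are actually loaded, dependents first. This is a minimal sketch: nvidia_unload_order is a hypothetical name, and the fixed order assumes the same four modules and dependencies as in the lsmod output shown earlier.

```shell
# Print the loaded NVIDIA modules in a safe unload order (dependents first).
# Reads lsmod-formatted text on stdin, so it can also be tried on saved output.
nvidia_unload_order() {
    # The first column of lsmod output is the module name.
    loaded=$(awk '{ print $1 }')
    # nvidia_drm depends on nvidia_modeset, and everything depends on nvidia.
    for mod in nvidia_uvm nvidia_drm nvidia_modeset nvidia; do
        if printf '%s\n' "$loaded" | grep -qx "$mod"; then
            printf '%s\n' "$mod"
        fi
    done
}
```

For a dry run, pipe it through sed to see the commands without executing them: lsmod | nvidia_unload_order | sed 's/^/sudo rmmod /'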
Troubleshooting
If you get an error like rmmod: ERROR: Module nvidia is in use, the kernel module is still being used by some process. Find the processes that are using the kmod:
sudo lsof /dev/nvidia*
Then kill those processes and continue unloading the kmods.
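When many processes hold the device files, reading the lsof table by hand gets tedious. The sketch below reduces it to a list of unique process IDs. nvidia_pids is a hypothetical helper name, and it assumes the default lsof column layout, where the PID is the second column and the first line is a header.

```shell
# Extract the unique PIDs from lsof output read on stdin.
# Assumption: default lsof layout (header line first, PID in column 2).
nvidia_pids() {
    awk 'NR > 1 { print $2 }' | sort -un
}
```

A possible use is sudo lsof /dev/nvidia* | nvidia_pids, followed by checking each PID with ps -p before killing it. On a desktop machine the display server is often among them, so stopping it will end the graphical session.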
Test
Confirm that you successfully unloaded those kmods:
lsmod | grep nvidia
You should get nothing. Then confirm you can load the correct driver:
nvidia-smi
You should get the normal nvidia-smi output instead of the version mismatch error.