How to get the ID of the GPU allocated to a SLURM job on a multi-GPU node?
You can get the GPU ID from the environment variable CUDA_VISIBLE_DEVICES. This variable is a comma-separated list of the GPU IDs assigned to the job.
You can check the environment variables SLURM_STEP_GPUS or SLURM_JOB_GPUS for a given node:
echo ${SLURM_STEP_GPUS:-$SLURM_JOB_GPUS}
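Note that CUDA_VISIBLE_DEVICES may not correspond to the physical GPU index on the node (see @isarandi's comment). For example, when SLURM constrains devices through cgroups, the visible devices can be renumbered starting from 0.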
Also, note that the SLURM variables should work for non-Nvidia GPUs as well.
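To use the IDs individually in a job script, the fallback above can be wrapped in a small helper. This is only a sketch: the function name allocated_gpu_ids is hypothetical, and it assumes the variables hold a plain comma-separated list. It prefers SLURM_STEP_GPUS, then SLURM_JOB_GPUS, and falls back to CUDA_VISIBLE_DEVICES only as a last resort, since that value may be renumbered.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SLURM): print the GPU IDs assigned to
# the current job or step, one per line.
# Lookup order: SLURM_STEP_GPUS, then SLURM_JOB_GPUS, then
# CUDA_VISIBLE_DEVICES (last resort; may not match the physical IDs).
allocated_gpu_ids() {
    local ids="${SLURM_STEP_GPUS:-${SLURM_JOB_GPUS:-${CUDA_VISIBLE_DEVICES:-}}}"
    if [ -z "$ids" ]; then
        # None of the variables is set: report failure to the caller.
        return 1
    fi
    # Assumes a plain comma-separated list; split it into one ID per line.
    printf '%s\n' "$ids" | tr ',' '\n'
}
```

Inside a job step, something like `for id in $(allocated_gpu_ids); do echo "GPU $id"; done` would then loop over the assigned IDs. The function returns a non-zero status when none of the three variables is set, so a script can detect that it is not running with a GPU allocation.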