How to submit a job to any [subset] of nodes from nodelist in SLURM?
Some of the tasks are parallelized, hence use all the CPU power of a single node while others are single threaded.
I understand that you want the single-threaded jobs to share a node, whereas the parallel ones should be assigned a whole node exclusively?
multiple jobs should run at the same time on a single node.
As far as my understanding of SLURM goes, this implies that you must define CPU cores as consumable resources (i.e., SelectType=select/cons_res
and SelectTypeParameters=CR_Core
in slurm.conf
)
Then, to constrain parallel jobs to get a whole node you can either use --exclusive
option (but note that partition configuration takes precedence: you can't have shared nodes if the partition is configured for exclusive access), or use -N 1 --tasks-per-node="number_of_cores_in_a_node"
(e.g., -N 1 --ntasks-per-node=8
).
Note that the latter will only work if all nodes have the same number of cores.
None of the tasks should spawn over multiple nodes.
This should be guaranteed by -N 1
.
You can work the other way around; rather than specifying which nodes to use, with the effect that each job is allocated all the 7 nodes, specify which nodes not to use:
sbatch --exclude=myCluster[01-09] myScript.sh
and Slurm will never allocate more than 7 nodes to your jobs. Make sure though that the cluster configuration allows node sharing, and that your myScript.sh
contains #SBATCH --ntasks=1 --cpu-per-task=n
with n
the number of threads of each job.