A lot of scientific software, codes or libraries can be parallelized via MPI or OpenMP/Multiprocessing.

Avoid submitting inefficient Jobs!

If your code can be parallelized only paritially (serial parts remaining), familiarize with Amdahl's law and make sure your Job efficiency is still well above 50%.

Default Values

Slurm parameters like --ntasks and --cpus-per-task default to 1 if omitted.

Pure MPI Jobs (n tasks)

#!/usr/bin/env bash

#SBATCH --job-name=test
#SBATCH --partition=epyc
#SBATCH --mail-type=END,INVALID_DEPEND
#SBATCH --mail-user=noreply@uni-a.de
#SBATCH --time=1-0

# Request memory per CPU
#SBATCH --mem-per-cpu=1G
# Request n tasks per node
#SBATCH --ntasks=n
# If possible, run all tasks on one node
#SBATCH --nodes=1

# Load application module here if necessary

# No need to pass number of tasks to srun
srun my_program

If --nodes=1 is omitted and all cluster nodes are almost full, Slurm might distribute a variable number of tasks on a variable number of nodes. Try to avoid this scenario by always setting a fixed number or a range of nodes via --nodes=a or --nodes=a-b with a ≤ b.

srun is the Slurm application launcher/job dispatcher for parallel MPI Jobs and (in this case) inherits all the settings from sbatch . This is the preferred way to start your MPI-parallelized application.

discouraged use of mpirun

The use of mpirun is heavily discouraged when queuing your Job via Slurm.

Pure MPI Jobs (n×m tasks on m nodes)

#!/usr/bin/env bash

#SBATCH --job-name=test
#SBATCH --partition=epyc
#SBATCH --mail-type=END,INVALID_DEPEND
#SBATCH --mail-user=noreply@uni-a.de
#SBATCH --time=1-0

# Request memory per CPU
#SBATCH --mem-per-cpu=1G
# Request n tasks per node
#SBATCH --ntasks-per-node=n
# Run on m nodes
#SBATCH --nodes=m

# Load application module here if necessary

# No need to pass number of tasks to srun
srun my_program

ProTip: Try to keep the number of nodes as small as possible. If n×m ≤ 128 --nodes=1 is always the best choice. This is due to latency of intra-node MPI communication (shared memory) being about two orders of magnitude lower than inter-node MPI communication (Network/Infiniband)

discouraged use of mpirun

The use of mpirun is heavily discouraged when queuing your Job via Slurm.

Bereichsverknüpfungen

Seitenhierarchie

Pure MPI Jobs (n tasks)

Pure MPI Jobs (n×m tasks on m nodes)

Bereichsverknüpfungen

Seitenhierarchie

Submitting Pure MPI Jobs

Pure MPI Jobs (n tasks)

Pure MPI Jobs (n×m tasks on m nodes)