Array Jobs

Job arrays let Slurm run the same job a large number of times very efficiently, with only slight differences between the individual jobs. For instance, let's say that you need to run 100 jobs, each with a different seed value for the random number generator. Job arrays are the best choice for such cases.

If you ever feel the need to do something like this:

 for i in {1..10} ; do
     sbatch runscript.sh ${i}.in
 done

Don't do it! Use Job Arrays instead!
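
The same effect can be achieved with a single submission. As a rough sketch (the line inside runscript.sh is an assumption about how your script consumes its input file):

sbatch --array=1-10 runscript.sh

# inside runscript.sh, use the array task ID instead of the loop variable,
# e.g. (hypothetical line, adapt to your program):
#   ./myprogram ${SLURM_ARRAY_TASK_ID}.in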

Below is an example Slurm script for running Python where there are 3 jobs in the array:

#!/bin/bash
#SBATCH --job-name=array-job     # create a short name for your job
#SBATCH --output=slurm-%A.%a.out # stdout file
#SBATCH --error=slurm-%A.%a.err  # stderr file
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # total number of tasks across all nodes
#SBATCH --cpus-per-task=1        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core (4G is default)
#SBATCH --time=00:01:00          # total run time limit (HH:MM:SS)
#SBATCH --array=0-2              # job array with index values 0,1,2
#SBATCH --mail-type=all          # send email on job start, end and fault
#SBATCH --mail-user=<username>@uni-a.de

echo "My SLURM_ARRAY_JOB_ID is $SLURM_ARRAY_JOB_ID."
echo "My SLURM_ARRAY_TASK_ID is $SLURM_ARRAY_TASK_ID"
echo "Executing on the machine:" $(hostname)

# Loading modules here

srun python myscript.py $SLURM_ARRAY_TASK_ID

The key line in the Slurm script above is:

#SBATCH --array=0-2

Job Array limits

  • The array indices i (the values of SLURM_ARRAY_TASK_ID) must satisfy 0 ≤ i ≤ 1000. If you need numbers outside of this range, see below.
  • You cannot have more than 1000 members of an Array Job.


In this example, the Slurm script will run three jobs. Each job will have a different value of SLURM_ARRAY_TASK_ID (i.e., 0, 1, 2). The value of SLURM_ARRAY_TASK_ID can be used to differentiate the jobs within the array. The following environment variables will be available:

SLURM_JOB_ID=354
SLURM_ARRAY_JOB_ID=353
SLURM_ARRAY_TASK_ID=0
SLURM_ARRAY_TASK_COUNT=3
SLURM_ARRAY_TASK_MAX=2
SLURM_ARRAY_TASK_MIN=0
SLURM_ARRAY_TASK_STEP=1

SLURM_JOB_ID=355
SLURM_ARRAY_JOB_ID=353
SLURM_ARRAY_TASK_ID=1
SLURM_ARRAY_TASK_COUNT=3
SLURM_ARRAY_TASK_MAX=2
SLURM_ARRAY_TASK_MIN=0
SLURM_ARRAY_TASK_STEP=1

SLURM_JOB_ID=353
SLURM_ARRAY_JOB_ID=353
SLURM_ARRAY_TASK_ID=2
SLURM_ARRAY_TASK_COUNT=3
SLURM_ARRAY_TASK_MAX=2
SLURM_ARRAY_TASK_MIN=0
SLURM_ARRAY_TASK_STEP=1

Note that SLURM_ARRAY_JOB_ID is the same for all sub-tasks in a Job Array, while the usual SLURM_JOB_ID is different for each sub-task. Also, for the last sub-task in a Job Array, SLURM_JOB_ID is equal to SLURM_ARRAY_JOB_ID.
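
One way to make use of these variables is to collect the output of all sub-tasks in a directory named after the array job, with one file per sub-task. A minimal sketch (the directory and file names below are only placeholders):

# shared directory for the whole Job Array, one log file per sub-task
OUTDIR=results_${SLURM_ARRAY_JOB_ID}
mkdir -p "$OUTDIR"
srun python myscript.py $SLURM_ARRAY_TASK_ID > "$OUTDIR/task_${SLURM_ARRAY_TASK_ID}.log"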

You can set the array indices to an explicit list of numbers, to one or more ranges, or to a combination of both, for example:

#SBATCH --array=0,100,200,300,400,500
#SBATCH --array=1-24,42,56-99
#SBATCH --array=0-1000

Be sure to use the value of SLURM_ARRAY_TASK_ID to assign unique names to the output files for each job in the array. Failure to do this will result in all jobs using the same filename(s).
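
A common pattern is to keep one parameter set per line in a text file and let each sub-task pick the line matching its index and write its result to a task-specific file. A minimal sketch, assuming a hypothetical parameter file params.txt:

# line number SLURM_ARRAY_TASK_ID+1 of params.txt holds this task's parameters
PARAMS=$(sed -n "$((SLURM_ARRAY_TASK_ID + 1))p" params.txt)
srun python myscript.py $PARAMS > result_${SLURM_ARRAY_TASK_ID}.txt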

Each job in the array will have the same values for nodes, ntasks, cpus-per-task, time and so on. This means that job arrays can be used to handle everything from serial jobs to large multi-node cases.
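
For illustration, each sub-task could itself be a small multi-node MPI job. The sketch below shows only the resource-related header lines and uses made-up numbers and a hypothetical program name:

#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --array=0-9

srun ./my_mpi_program $SLURM_ARRAY_TASK_ID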

See the SLURM documentation for more information.


Advanced Task IDs

You may also introduce an increment to array ranges, for example:

#SBATCH --array=0-7:2
# is the same as
#SBATCH --array=0,2,4,6
# and steps may even be combined with multiple ranges
#SBATCH --array=0-2:2,4-9:2,12

If you need numbers i outside the range 0 ≤ i ≤ 1000, use bash arithmetic to generate the necessary numbers from SLURM_ARRAY_TASK_ID.

MYVAR=$((SLURM_ARRAY_TASK_ID * 1000 + 25))   # arithmetic expansion
let MYVAR=SLURM_ARRAY_TASK_ID*1000+25        # equivalent alternative
# if you need leading zeros:
MYVAR=$(printf "%05d" $MYVAR)

Limiting the number of concurrent Jobs in Array Jobs

By default, as many sub-tasks of an Array Job as possible are started immediately. In the best case, all sub-tasks start at once, provided there are enough free resources.

If you need to limit the number n of concurrently running sub-tasks in an Array Job, simply append %n to the --array option, for example:

#SBATCH --array=0-9%4
#SBATCH --array=0-2,4-9%4
#SBATCH --array=0-2:2,4-9:2,12%4
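
The throttle can usually also be adjusted after submission (assuming a reasonably recent Slurm version; 12345 below is a placeholder job ID):

# reduce the number of simultaneously running sub-tasks of job 12345 to 2
scontrol update JobId=12345 ArrayTaskThrottle=2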

Chain Jobs

If all your sub-tasks depend on one another, you can also limit the concurrency to n=1. This is commonly called a Chain Job, because a sub-task will only start after the previous one has ended.

#SBATCH --array=0-9%1

If your program periodically writes its state to a file on disk and your calculation can be resumed from this state, Chain Jobs can be used to overcome the partition time limit. Whenever a sub-task is killed by SLURM because it runs into the time limit (TIMEOUT), the next sub-task continues from the last saved state.

Beware

If you use this type of job, make sure to include a check whether the program has already completed; if so, turn all remaining array tasks into no-ops that finish immediately. This is necessary because the number of tasks in a Job Array cannot be changed after submission. Do not set the number of repetitions (9 in this example) too high.

Schematic example of checking whether the program is completed

#!/bin/bash
#SBATCH --job-name=array-job
#SBATCH --output=slurm-%A.%a.out
#SBATCH --error=slurm-%A.%a.err
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=4G
#SBATCH --time=00:01:00
#SBATCH --array=0-9%1
#SBATCH --mail-type=all
#SBATCH --mail-user=<username>@uni-a.de

if [[ <insert check whether your calculation is completed> ]] ; then
	exit 0
fi

srun ./myprogram
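
As one purely illustrative way to fill in the check above: if your program creates a marker file, say DONE, once it has finished, the remaining sub-tasks can be turned into no-ops like this:

# hypothetical completion check: the program is assumed to create a file
# named DONE in the working directory when the calculation has finished
if [[ -f DONE ]] ; then
    echo "Calculation already completed, nothing left to do."
    exit 0
fi

srun ./myprogram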