Flux Framework - Flux in Slurm
Getting started ...
What is it?
Flux Framework is a task-scheduling and resource-management framework, much like Slurm. However, it can run completely in user space. We describe it here as an alternative to Slurm's srun task-farming capabilities.
Flux is rather versatile, but also quite complex, and still under very active development. We therefore refer to the flux documentation for all the details left out here.
Using LRZ Module
> module av flux-core
------------------ /lrz/sys/share/modules/files_sles15/tools -------------------------
flux-core/0.63.0  flux-core/0.64.0
> module load flux-core
Own Installation
The simplest installation is probably via conda.
> conda create -n my_flux -c conda-forge flux-core flux-sched
> conda activate my_flux
(my_flux) > flux version
commands:               0.64.0
libflux-core:           0.64.0
build-options:          +hwloc==2.8.0+zmq==4.3.5
If you need a more up-to-date version of flux, you probably cannot avoid building it from source (https://github.com/flux-framework/). Another option to install flux-core is Spack (user_spack), which may also simplify a source build. However, in order to get the latest version, manual manipulation of the Spack package will be necessary.
flux-sched is not strictly necessary; a simple scheduler is always built in. The Fluxion scheduler from flux-sched is supposed to be better than that, but we could not find application scenarios that convinced us.
Interactive Workflows
Real interactive work with Flux is probably of limited use. But for testing purposes, and as a sort of starting point, let us have a short look at it. We start from a login node.
login > module load flux-core        # or, activate the flux environment
(my_flux) login > srun -N 2 -M inter -p cm2_inter --pty flux start   # allocate resources (on the cluster/partition you want)
i22r07c05s05 > flux uptime           # basic info about the running flux instance
 14:11:57 run 7.9s,  owner ⼌⼌⼌⼌⼌⼌⼌,  depth 0,  size 2
i22r07c05s05 > flux resource info    # basic info about the resources managed by the flux instance
2 Nodes, 56 Cores, 0 GPUs
i22r07c05s05 > flux run --label-io -N2 hostname    # run a task (here, one per node)
0: i22r07c05s05
1: i22r07c05s08
i22r07c05s05 > flux bulksubmit --output=log.{{id}} -n 1 -c 7 /lrz/sys/tools/placement_test_2021/bin/placement-test.omp_only -t 7 -d 20 ::: $(seq 0 100)
ƒCF6D7Bu                             # flux job IDs
[...]
i22r07c05s05 > flux jobs -a
       JOBID USER      NAME        ST NTASKS NNODES     TIME INFO
[...]
    ƒCL2LiaU ⼌⼌⼌⼌⼌⼌⼌  placement+   S      1      -        -
    ƒCGVkRgt ⼌⼌⼌⼌⼌⼌⼌  placement+   R      1      1   8.580s i22r07c05s05
    ƒCGVkRgs ⼌⼌⼌⼌⼌⼌⼌  placement+   R      1      1   10.15s i22r07c05s11
    ƒCGUGSQa ⼌⼌⼌⼌⼌⼌⼌  placement+   R      1      1   12.45s i22r07c05s11
    ƒCGUGSQZ ⼌⼌⼌⼌⼌⼌⼌  placement+   R      1      1   12.45s i22r07c05s11
    ƒCGUGSQY ⼌⼌⼌⼌⼌⼌⼌  placement+   R      1      1   12.79s i22r07c05s05
    ƒCGUGSQX ⼌⼌⼌⼌⼌⼌⼌  placement+   R      1      1   13.35s i22r07c05s11
    ƒCGSnT8C ⼌⼌⼌⼌⼌⼌⼌  placement+   R      1      1   14.15s i22r07c05s05
    ƒCGSnT8B ⼌⼌⼌⼌⼌⼌⼌  placement+   R      1      1   17.15s i22r07c05s05
    ƒCG62dBP ⼌⼌⼌⼌⼌⼌⼌  placement+  CD      1      1   23.41s i22r07c05s05
    ƒCG62dBQ ⼌⼌⼌⼌⼌⼌⼌  placement+  CD      1      1   19.54s i22r07c05s11
    ƒCG62dBM ⼌⼌⼌⼌⼌⼌⼌  placement+  CD      1      1   20.68s i22r07c05s11
[...]
i22r07c05s05 > exit
flux has an elaborate built-in help system. Please use flux help and flux help <command> to get information or a reminder. flux submit/bulksubmit, flux cancel <job ID> and flux jobs -a can be used similarly to sbatch, scancel and squeue under Slurm. flux cancelall -f may turn out to be a highlight in first tests.
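For a first test, the analogy can be tried out directly (sleep 60 is just a dummy workload; substitute the job ID that flux submit printed):

flux submit -n 1 sleep 60     # cf. sbatch
flux jobs -a                  # cf. squeue
flux cancel ƒCF6D7Bu          # cf. scancel; use the ID printed by flux submit
flux cancelall -f             # cancel all jobs of this instance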
Non-Interactive Workflows
The far more common approach to using flux is probably to bundle a bunch of tasks within a Slurm job. Even this simple setting already spans a wide range of possible workflows, which we cannot cover here in full. But an example should illustrate the basic principle.
With srun, the flux instance is started (one broker process per node), together with a script, workflow.sh, which contains the actual flux workflow description. We use here some dummy programs which report the rank/thread-to-CPU placement; it is probably a good idea to check its correctness. The Slurm script is submitted as usual via sbatch. A minimal sketch of both scripts follows below.
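The original scripts are not reproduced here; the following sketch uses the two-node setting and the placement-test binary from the interactive example above, while job name, time limit, and the waitable flag on bulksubmit are our assumptions (waitable jobs are explained further below).

#!/bin/bash
#SBATCH -J flux_workflow        # placeholder job name
#SBATCH -N 2
#SBATCH --time=00:30:00

module load flux-core

# One flux broker per node; workflow.sh becomes the initial program of the
# flux instance, and the instance ends when workflow.sh returns.
# Depending on the cluster, --mpi=pmi2 may be needed instead (see remark below).
srun --ntasks-per-node=1 --mpi=none flux start ./workflow.sh

And the workflow script:

#!/bin/bash
# workflow.sh - the actual flux workflow description (a sketch)
flux resource info
flux bulksubmit --flags=waitable --output=log.{{id}} -n 1 -c 7 \
    /lrz/sys/tools/placement_test_2021/bin/placement-test.omp_only -t 7 -d 20 ::: $(seq 0 100)
flux job wait --all    # without this, the script (and the instance) would end right after submission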
NB: We tested here with Intel MPI, where flux run works remarkably well concerning the rank/thread placement.
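As an illustration, a hypothetical Intel MPI binary my_mpi_app could be launched inside the instance like this (8 ranks across the 2 nodes of the example above, 7 cores per rank):

flux run -N 2 -n 8 -c 7 ./my_mpi_app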
Remark: We found that the srun option --mpi=none worked on one cluster, while on another it needed to be --mpi=pmi2. We could not exactly figure out the deeper cause, but we suspect that subtle differences in the Slurm version or configuration are responsible. Please try out what works, or ask for help in our Service Desk.
Waitable Jobs
In general, flux submit submits a job and returns to the shell immediately. Specifically, for mass jobs within a Slurm job, the workflow script above would then simply return after the submission of the last flux job. To handle this, flux submit knows the option --flags=waitable. Together with a subsequent flux job wait --all, this gives an idiom similar to srun ... & followed by wait for Slurm job farming. However, the flux documentation claims that flux job wait is much more lightweight than bash's wait.
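A minimal sketch of this idiom, with my_task standing in for an arbitrary binary:

for i in $(seq 0 9); do
    flux submit --flags=waitable -n 1 ./my_task ${i}   # returns immediately after submission
done
flux job wait --all                                    # blocks until all waitable jobs are done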
Dependency Trees
flux submit also knows job dependencies via the --dependency=... option. Here, ... can for instance be afterok:JOBID. That is semantically equal to Slurm's sbatch job dependencies.
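A sketch of a two-step chain (step1 and step2 are placeholders); flux submit prints the job ID to stdout, which we capture here:

jobid=$(flux submit -n 1 ./step1)
flux submit --dependency=afterok:${jobid} -n 1 ./step2   # starts only after step1 completed successfully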
After the Slurm Job Stops
flux seems not to have a persistent job-bookkeeping facility. So, an automatic restart from a certain state of progress within a job is probably not directly possible.
But flux queue offers some capabilities to document/archive the status of flux's queue. Please check the cheat sheet below.
# Stop the queue, wait for running jobs to finish, and dump an archive.
flux queue stop
flux queue idle
flux dump ./archive.tar.gz
In order to execute that reliably in a Slurm job, some bash trap ... EXIT may be necessary (where ... is some cleanup bash function), as sketched below.
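A sketch of such a cleanup trap inside the workflow script (the archive path is arbitrary):

cleanup() {
    flux queue stop
    flux queue idle                # wait for already running jobs to finish
    flux dump ./archive.tar.gz
}
trap cleanup EXIT                  # also trapping TERM may help when Slurm hits the time limit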
Also, the following idiom can output valuable information about the flux jobs (please use flux jobs -o 'help', or check the documentation):
flux job wait --all
flux jobs -a -o '{id} {username} {ncores} {nnodes} {nodelist} {t_run} {t_cleanup} {runtime}'
Further Reading
Flux Framework comes with a vast scope of documentation, user guides and tutorials. We suggest that beginners start with the Learning Guide.
To embed Flux into a Slurm job, please consult the documentation on that topic.
For a good overview, the Cheat Sheet is of tremendous help.