PRACE Course: HPC Code Optimisation Workshop 2021
Learning Goals
Through a sequence of simple, guided examples of code modernisation, attendees will develop an awareness of the features of multi- and many-core architectures that are crucial for writing modern, portable and efficient applications.
The workshop is a PRACE training event organised by LRZ in cooperation with Intel and NHR@FAU.
Preliminary Agenda
Session | Topic (Presenters) |
1st day morning | Intro (Volker Weinberg) |
1st day afternoon | Intel Compiler & Vectorization (Igor Vorobtsov / Alina Shadrina) |
2nd day morning | Roofline Model (Jonathan Coles) |
2nd day afternoon | VTune (Michael Steyer) |
3rd day morning | LIKWID (Carla Guillen / Thomas Gruber) |
3rd day afternoon | Performance Optimization of CPMD (Gerald Mathias) |
Presenters
Jonathan Coles, Gerald Mathias, Carla Guillen (LRZ)
Thomas Gruber (NHR@FAU)
Edmund Preiss, Alina Shadrina, Michael Steyer, Dmitry Tarakanov, Igor Vorobtsov (Intel)
Assistant
- Momme Allalen (LRZ)
Slides and Exercises
Day 1
Day 2
Day 3
Recommended Access Tools
- Exercises will be done on the CoolMUC-2 cluster at LRZ, which has 28-way Haswell-based nodes and an FDR14 InfiniBand interconnect.
- Please use your own laptop or PC with X11 support and an ssh client installed for the hands-on sessions.
- Under Windows
- Install and run the Xming X11 server for Windows (https://sourceforge.net/projects/xming/), then install and run the terminal software PuTTY (https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html).
- Alternatively, we recommend installing the convenient tool MobaXterm (https://mobaxterm.mobatek.net/download-home-edition.html), which also includes an X11 server.
- Under macOS
- Install XQuartz for X11 support on macOS: https://www.xquartz.org/
- Under Linux
- ssh and X11 support come with all distributions.
Login under Windows:
- Start Xming and then PuTTY.
- Enter the host name lxlogin1.lrz.de into the PuTTY host name field and click Open.
- Accept & save host key [only first time]
- Enter user name and password (provided by LRZ staff) into the opened console.
Login under Mac:
- Install XQuartz for X11 support on macOS: https://www.xquartz.org/
- Open Terminal
- ssh -Y lxlogin1.lrz.de -l username
- Use user name and password (provided by LRZ staff)
Login under Linux:
- Open xterm
- ssh -Y lxlogin1.lrz.de -l username
- Use user name and password (provided by LRZ staff)
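After logging in with ssh -Y, it can be useful to confirm that X11 forwarding is actually active before starting a graphical tool. The following is a minimal sketch; the function name check_x11 is hypothetical and not part of the LRZ setup. It relies only on the fact that ssh sets the DISPLAY variable in the remote session when forwarding works.

```shell
#!/bin/bash
# check_x11 is a hypothetical helper name chosen for this sketch.
# ssh sets DISPLAY in the remote shell when X11 forwarding is active.
check_x11() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 forwarding active: $DISPLAY"
    else
        echo "DISPLAY is not set; reconnect with ssh -Y"
        return 1
    fi
}
```

Run check_x11 on the login node directly after connecting. If it reports that DISPLAY is not set, check that the local X server (Xming, XQuartz) is running and reconnect.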
How to use the CoolMUC-2 System
Login Nodes:
- lxlogin1.lrz.de
- lxlogin2.lrz.de
- lxlogin3.lrz.de
- lxlogin4.lrz.de
The reservation is only valid during the workshop. For general usage on our Linux Cluster, remove the "--reservation=hcow1w21" option.
- Submit a job:
sbatch --reservation=hcow1w21 job.sh
- List own jobs:
squeue -M cm2_tiny
- Cancel jobs:
scancel -M cm2_tiny jobid
- Show reservations:
sinfo -M cm2_tiny --reservation
- Interactive Access:
module load salloc_conf/cm2_tiny
salloc --partition=cm2_tiny --time=00:30:00 --reservation=hcow1w21
export OMP_NUM_THREADS=28
srun --reservation=hcow1w21 ./myprog.exe
exit
or: srun --reservation=hcow1w21 --pty bash
Details: https://doku.lrz.de/display/PUBLIC/Running+parallel+jobs+on+the+Linux-Cluster
Examples: https://doku.lrz.de/display/PUBLIC/Example+parallel+job+scripts+on+the+Linux-Cluster
Resource limits: https://doku.lrz.de/display/PUBLIC/Resource+limits+for+parallel+jobs+on+Linux+Cluster
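The submit, list and cancel commands above are often chained in a script. The following is a rough sketch of one way to do that, not an LRZ-provided tool: the function names parse_job_id and submit_job are hypothetical. It assumes the standard sbatch option --parsable, which prints either "jobid" or "jobid;cluster", so the cluster suffix has to be stripped before the ID is passed on.

```shell
#!/bin/bash
# Hypothetical helper: extract the numeric job ID from the output of
# `sbatch --parsable`, which has the form "jobid" or "jobid;cluster".
parse_job_id() {
    local raw="$1"
    printf '%s\n' "${raw%%;*}"
}

# Hypothetical wrapper: submit a job script to the workshop reservation
# and print only the job ID. Requires sbatch, so it only works on the cluster.
submit_job() {
    local script="$1" raw
    raw=$(sbatch --parsable --reservation=hcow1w21 "$script") || return 1
    parse_job_id "$raw"
}
```

On a login node this could be used as jobid=$(submit_job job.sh), followed by squeue -M cm2_tiny -j "$jobid" to check the job or scancel -M cm2_tiny "$jobid" to cancel it.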
Example OpenMP Batch File
#!/bin/bash
#SBATCH -o /dss/dsshome1/0B/a2c06ae/test.%j.%N.out
#SBATCH -D /dss/dsshome1/0B/a2c06ae
#SBATCH -J test
#SBATCH --clusters=cm2_tiny
#SBATCH --partition=cm2_tiny
#SBATCH --nodes=1-1
#SBATCH --cpus-per-task=28
#SBATCH --get-user-env
#SBATCH --reservation=hcow1w21
#SBATCH --time=02:00:00
module load slurm_setup
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./myprog.exe
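The batch file above takes the thread count from SLURM_CPUS_PER_TASK. If the same script is also run outside a Slurm job, that variable is unset and OMP_NUM_THREADS ends up empty. A possible guard is sketched below; the helper name set_omp_threads is hypothetical. The placement settings OMP_PLACES and OMP_PROC_BIND are standard OpenMP environment variables, but using them here is an assumption of this sketch and not part of the workshop's batch file.

```shell
#!/bin/bash
# set_omp_threads is a hypothetical helper name chosen for this sketch.
set_omp_threads() {
    # Use the Slurm allocation if present, otherwise fall back to 1 thread.
    export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
    # Assumption: pin one thread per core, filling neighbouring cores first.
    export OMP_PLACES=cores
    export OMP_PROC_BIND=close
}
```

Calling set_omp_threads in place of the plain export line keeps the script usable both inside a job and in a quick test on the command line.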
Intel Software Stack
The Intel software stack is automatically loaded at login. The Intel compilers are called icc (for C), icpc (for C++) and ifort (for Fortran). They behave similarly to the GNU compiler suite (the option -help shows an option summary). For reasonable optimisation including SIMD vectorisation, use the options -O3 -xavx. You can use -O2 instead of -O3 and sometimes get better results, since at -O3 the compiler will sometimes try to be overly smart and undo many of your hand-coded optimisations.
By default, OpenMP directives in your code are ignored. Use the -qopenmp option to activate OpenMP.
Use mpiexec -n #tasks to run MPI programs. The compiler wrappers' names follow the usual mpicc, mpifort, mpiCC pattern.
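To illustrate how the compiler options described above combine on one command line, here is a small sketch. The helper name icc_cmd and the file names myprog.c and myprog.exe are assumptions made for this example. The function only prints the command, so it can be inspected before anything is compiled.

```shell
#!/bin/bash
# icc_cmd is a hypothetical helper that prints an icc command line built
# from the options named in the text above.
# Usage: icc_cmd <O2|O3> <yes|no for OpenMP> <source file> <output file>
icc_cmd() {
    local opt="$1" omp="$2" src="$3" out="$4"
    local cmd="icc -$opt -xavx"
    # OpenMP directives are ignored unless -qopenmp is given.
    [ "$omp" = "yes" ] && cmd="$cmd -qopenmp"
    printf '%s\n' "$cmd $src -o $out"
}
```

For example, icc_cmd O3 yes myprog.c myprog.exe prints icc -O3 -xavx -qopenmp myprog.c -o myprog.exe, which can then be run on a login node. Comparing a build with O2 against one with O3 is a quick way to check the remark above about -O3.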
Intel oneAPI
The most recent version of the Intel software stack, "Intel oneAPI", can be loaded with:
uid@cm2login1:~> module load intel-oneapi
intel-oneapi-mpi: using intel wrappers for mpicc, mpif77, etc
Loading intel-oneapi/2021.4
  Unloading conflict: intel-mpi/2019-intel intel/19.0.5 intel-mkl/2019
  Loading requirement: intel-oneapi-compilers/2021.4.0 intel-oneapi-mkl/2021 intel-oneapi-mpi/2021-intel intel-oneapi-itac/2021.4.0
uid@cm2login1:~> module list
Currently Loaded Modulefiles:
 1) admin/1.0  2) tempdir/1.0  3) lrz/1.0  4) spack/21.1.1  5) intel-oneapi-compilers/2021.4.0
 6) intel-oneapi-mkl/2021  7) intel-oneapi-mpi/2021-intel  8) intel-oneapi-itac/2021.4.0  9) intel-oneapi/2021.4
uid@cm2login1:~> module av intel-oneapi
-------------- /lrz/sys/spack/.oneapi_rebuild/modules/x86_64/linux-sles15-x86_64 ---------------
intel-oneapi-advisor/2021.4.0    intel-oneapi-ipp/2021.4.0    intel-oneapi-mkl/2021.3.0
intel-oneapi-ccl/2021.4.0        intel-oneapi-ippcp/2021.4.0  intel-oneapi-mkl/2021.4.0
intel-oneapi-clck/2021.4.0       intel-oneapi-itac/2021.4.0   intel-oneapi-mpi/2021-gcc
intel-oneapi-compilers/2021.4.0  intel-oneapi-mkl/2021        intel-oneapi-mpi/2021-intel
intel-oneapi-dal/2021.4.0        intel-oneapi-mkl/2021-gcc8   intel-oneapi-tbb/2021.4.0
intel-oneapi-dnn/2021.4.0        intel-oneapi-mkl/2021-seq    intel-oneapi-vpl/2021.6.0
intel-oneapi-dpcpp-ct/2021.4.0   intel-oneapi-mkl/2021.1.1    intel-oneapi-vtune/2021.7.1
intel-oneapi-inspector/2021.4.0  intel-oneapi-mkl/2021.2.0
Upon loading the main intel-oneapi module, the default modules intel, intel-mpi, and intel-mkl are unloaded and replaced by the intel-oneapi-* variants. Further intel-oneapi-xxx modules are available via the module command.
PRACE Survey
Please fill out the PRACE online survey at
https://events.prace-ri.eu/event/1268/surveys/871
This helps us and PRACE to
- increase the quality of the courses,
- design the future training programme at LRZ and in Europe according to your needs and wishes,
- get future funding for training events.