General Information
ORCA is a flexible, efficient and easy-to-use general-purpose tool for quantum chemistry with specific emphasis on spectroscopic properties of open-shell molecules. It features a wide variety of standard quantum chemical methods, ranging from semiempirical methods to DFT to single- and multireference correlated ab initio methods, and can also treat environmental and relativistic effects. On the 25th of July, ORCA 6.0.0 was released, a major step forward in speed and efficiency. While the ORCA v5 series modules are still available, we encourage all users to switch to ORCA v6.
Documentation and Tutorials are readily available online.
Licensing
ORCA can be used for academic purposes only. By using ORCA, you agree to the End User License Agreement (EULA). The HPC-Team has licensed ORCA in order to provide it to its users.
Obtaining Access
According to the EULA, this license is not transferable or sublicensable. We therefore have to require every user to register at the ORCA forums and provide proof of this registration. Users need to:
- Register at the ORCA forums and obtain a license by agreeing to the EULA. You will receive an email with the subject `Confirmation ORCA Registration`.
- Forward this email (as attachment!) to the team of IT-Physik together with the following information:
  - your RZ-Benutzerkennung
  - your numeric User-ID (not username!) at the ORCA forums.
After verifying your registration, you will be granted access to ORCA modules and executables.
Running ORCA
Lmod modules
The available versions can be listed with:

```bash
ml spider orca
```
Starting from `orca/6`, the following environment variables will be set:

- `ORCAPATH`: path containing the ORCA binaries.
- `ORCATMPDIR`: path defining temporary storage, either `/dev/shm` (default) or `/tmp`, depending on the setting of `TMPDIRMODE` (see the example Slurm Job template).
- `ORCA_HEADNODE`: name of the head node where the orca executable was called.
- `RSH_COMMAND`: command used to connect to remote nodes when group parallelization is used (set to `rsh` by the module).
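As a quick sanity check after loading the module, the variables can be inspected interactively. This is a minimal sketch using the variable names listed above:

```bash
ml purge
ml orca/6

# Print where the module placed the binaries and the temporary storage
echo "ORCA binaries in:     $ORCAPATH"
echo "Temporary storage in: $ORCATMPDIR"
```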
Do not call orca with srun
Using `srun` to call orca will result in a job failure. ORCA consists of a multitude of different executables; the lightweight `orca` binary calls these executables for you in the correct order. This works well on workstations but does not integrate well with queueing systems like Slurm. Since these calls are hardcoded and direct SSH access to nodes is not allowed, we need to mock the `mpirun` and `rsh` executables to alter the commands and do the right thing.
Slurm Job template
```bash
#!/bin/bash
#SBATCH --job-name=runorca
#SBATCH --partition=epyc
#SBATCH --ntasks=8
#SBATCH --ntasks-per-node=8
#SBATCH --mem-per-cpu=4G
#SBATCH --time=7-0

# By default, ORCATMPDIR is set to a Ramdisk within a Slurm Job
# To use local SSD for ORCATMPDIR, set
# export TMPDIRMODE=SSD
# To force using GPFS for ORCATMPDIR, set (heavily discouraged!)
# export TMPDIRMODE=GPFS

ml purge
ml orca/6

export JOB=${1/\.inp/}
export WORKDIR=$(pwd)
shopt -s extglob

# Copy inputfiles to all nodes
srun -n$SLURM_NNODES -N$SLURM_NNODES cp *.!(out|log|err) "$ORCATMPDIR"
cd "$ORCATMPDIR"

# We need to call orca using its full path!
$ORCAPATH/orca ${JOB}.inp >> "${WORKDIR}/${JOB}.out"
ORCAEXITCODE=$?

cp *.!(out|log|err) "$WORKDIR"
exit $ORCAEXITCODE
```
Make sure that the number of `--ntasks` is in sync with the respective sections in the input file. Using `--ntasks` in combination with `--ntasks-per-node` is recommended for ORCA in order to avoid mistakes.
The job is then started using the input file as a command-line argument:

```bash
sbatch orca6.sl sample.inp
```
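For orientation, a hypothetical `sample.inp` matching the `--ntasks=8` of the template above might look like this; the method, basis set, and geometry are placeholders only:

```
! BP86 def2-SVP Opt

%pal
  nprocs 8   # must match --ntasks in the Slurm script
end

* xyz 0 1
  O   0.000000   0.000000   0.000000
  H   0.000000   0.757000   0.586000
  H   0.000000  -0.757000   0.586000
*
```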
Note on temporary data
By default, the module will use a local ramdisk to store temporary data, so make sure that you request enough memory. Due to the architecture of ORCA, data is exchanged exclusively via temporary files, involving large numbers of files and large amounts of data. Using the GPFS as a temporary directory will cause severe slowdowns of your calculations and is therefore heavily discouraged!
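If the temporary files would not fit into the requested memory, the local SSD can be used instead. This is a sketch following the comments in the job template above, where `TMPDIRMODE` is exported before the module is loaded:

```bash
# In the job script, before loading the module (cf. the template comments):
export TMPDIRMODE=SSD   # store ORCATMPDIR on the local SSD instead of the ramdisk
ml purge
ml orca/6
```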
Parallelization
ORCA is parallelized exclusively via MPI. The underlying OpenBLAS library is single-threaded, so setting `--cpus-per-task` > 1 will not improve the performance of your calculations in any way.
Using too many CPU cores can sometimes even slow down calculations. For larger studies it is therefore recommended to check the scaling efficiency first (for example by limiting the number of SCF cycles).
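One way to set up such a scaling test is to cap the number of SCF iterations so that runs with different core counts finish quickly. A sketch; the limit of 10 iterations is an arbitrary choice, and the run will stop unconverged, which is acceptable for a pure timing comparison:

```
%scf
  MaxIter 10   # stop after a fixed number of SCF cycles for the scaling test
end
```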
It is absolutely crucial that `--ntasks` of the Slurm script matches the respective section of the input file:

```
%pal
  nprocs 8
end
```

or one of the `PALX` keywords.
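For illustration, the same request can be made with one of the simple-input keywords instead of a `%pal` block; the method and basis set are placeholders only:

```
# PAL8 requests 8 parallel processes, matching --ntasks=8
! BP86 def2-SVP PAL8
```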
Group parallelization
Starting from `orca/6`, group parallelization (splitting the problem up into independent task groups) is available for the following types of calculations:

- all NumCalc methods, such as NumGrad, NumFreq, VPT2, Overtones, NEB, and GOAT
- the analytical Hessian (speedup especially for very large calculations)

The group size is set in the `%pal` block:

```
%pal
  nprocs 8        # total number of parallel processes
  nprocs_group 2  # number of parallel processes per sub-task
end
```
Notes on the choice of the number of CPU cores per group
`nprocs_group` needs to be an integer divisor of both `--ntasks` and `--ntasks-per-node`.
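A worked example, using the values `--ntasks=8` and `--ntasks-per-node=8` from the job template above; the group size of 2 is only illustrative:

```
%pal
  nprocs 8         # matches --ntasks
  nprocs_group 2   # 2 divides both 8 (--ntasks) and 8 (--ntasks-per-node): 4 groups of 2 processes
end
```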
Multinode parallelization
Multinode parallelization is required when more than 128 CPU cores are needed. This should work out of the box for most types of calculations.
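For example, a two-node job could be requested with the following header, a sketch assuming 128 cores per node as implied above; only the changed lines of the job template are shown:

```bash
#!/bin/bash
#SBATCH --partition=epyc
#SBATCH --ntasks=256          # 2 x 128 cores
#SBATCH --ntasks-per-node=128
#SBATCH --mem-per-cpu=4G
#SBATCH --time=7-0
# ... rest identical to the job template above
```

The input file then needs `nprocs 256` in its `%pal` block.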
Interfacing with xtb
ORCA can interface with xtb. Starting with `orca/6`, this works without loading any other module.
Calculations involving xtb are limited to a single node. When you request more than one node, the program will error out immediately.
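A minimal single-node sketch using the xtb interface; the `XTB2` keyword (GFN2-xTB) and the geometry file `molecule.xyz` are illustrative assumptions:

```
! XTB2 Opt

%pal
  nprocs 8   # keep all processes on one node for xtb
end

* xyzfile 0 1 molecule.xyz
```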
Support
If you have any problems with ORCA please contact the team of IT-Physik (preferred) or the HPC-Servicedesk.
Also, if you have improvements to this documentation that other users could benefit from, please reach out!