General
R is a free software environment for statistical computing and graphics. It is a modular code interpreter and can be extended via packages from the Comprehensive R Archive Network (CRAN).
Recommended reading for all R users: the CRAN Task View "High-Performance and Parallel Computing with R" gives an overview of R's HPC-related packages grouped by topic.
Modules providing R
Modules providing R can be found via
module spider R
The "*-cf" versions are actually a conda environments but without any conda functionality exposed to the user. The packages are from the popular conda-forge
repository and it is optimized for haswell
( –mtune=haswell
) processors, which is also suitable for Zen3 AMD processors. In addition all BLAS and LAPACK calls are handled by optimized Intel MKL libraries ( v2023.2.1
).
For details regarding compilation flags, see $R_HOME/etc/Makeconf after loading the module.
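Once a module is loaded, you can also inspect the toolchain from within R itself; a small sketch using standard base-R utilities:
flags <- readLines(file.path(R.home("etc"), "Makeconf"))   # $R_HOME/etc/Makeconf
writeLines(grep("^(CC|CFLAGS|FFLAGS) *=", flags, value = TRUE))  # compiler flags recorded at build time
sessionInfo()   # also lists the BLAS/LAPACK libraries in use (should point to MKL)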
module load R/4.3.3
This module also does a bit more work under the hood. It sets the following environment variables:
- R_LIBS_USER=/hpc/gpfs2/scratch/u/$USER/.R/4.3, so that users may install additional CRAN packages themselves or override existing packages with different versions. The directory is created automatically by the module.
- TMPDIR=/hpc/gpfs2/scratch/u/$USER/.Rtmp/4.3, so that users' temporary files don't accumulate in or fill up the global /tmp folder on the login nodes. The directory is created automatically by the module.
- OMP_NUM_THREADS=1, so that implicit multithreading via OpenMP is disabled by default. If this variable were unset, each R process might use as many threads as are available on a node (typically 128), even if fewer cores were requested/allocated. If R is additionally parallelized via process forking (the parallel package), the nodes could easily become overloaded.
- various other environment variables needed for dependencies.
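You can check these settings from within an R session; a quick sketch using base R:
Sys.getenv(c("R_LIBS_USER", "TMPDIR", "OMP_NUM_THREADS"))   # variables set by the module
.libPaths()   # library search path: the user library in scratch should be listed first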
Installed packages
Installing additional packages
Ticket via HPC-Service
This is the preferred method if the package in question has been published on CRAN, since it will then be available to everyone using the provided R modules. For more information on how to open a ticket, see Service and Support.
Self service
The R modules set R_LIBS_USER to a directory in your scratch area, e.g. /hpc/gpfs2/scratch/u/$USER/.R/4.3, where packages will be installed. This typically works well unless your package requires additional system libraries as dependencies. In that case the installation will fail; please open a ticket instead!
CRAN package names are case sensitive!
install.packages("BlA")
BiocManager::install("BlA")
Packages installed by users (located in R_LIBS_USER) will always be preferred. If you require a newer (or different) version of a package that is already installed centrally, you can simply install it yourself.
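To confirm which copy of a package is actually picked up, a small sketch (the package name somepkg is only a placeholder):
packageVersion("somepkg")   # version that library() would load
find.package("somepkg")     # path of that copy (should be in your scratch area)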
Performance considerations and common pitfalls
Implicit multithreading
R is able to use implicit multithreading in a subset of optimized functions, e.g. functions that take advantage of parallelized BLAS/LAPACK routines. This is controlled by the environment variable OMP_NUM_THREADS, which is set to 1 by default. If you know your code benefits from this, you can increase it manually, either within R or by changing the environment variable explicitly (and/or reducing the number of worker or MPI processes at the same time). Running benchmarks is mandatory when using this, and don't expect miracles.
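A minimal sketch of raising the thread count from within an R session (the value 4 is arbitrary, and the RhpcBLASctl package shown as the second option is an assumption and may need to be installed first):
# Option 1: set the variable early in the session, before the first threaded BLAS/OpenMP call
Sys.setenv(OMP_NUM_THREADS = "4")
# Option 2 (assumption: the RhpcBLASctl package is available):
RhpcBLASctl::blas_set_num_threads(4)
RhpcBLASctl::omp_set_num_threads(4)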
Using parallel::detectCores()
This will detect the number of CPU cores of the entire compute node (typically 128), irrespective of the number of cores you requested via Slurm. When Slurm has allocated less than a full node for your job, this leads to node overloading (spawning more threads/processes than CPU cores allocated) and consequently to inefficient jobs. Therefore, do not use parallel::detectCores() at all on HPC clusters; always use either parallelly::availableCores() or future::availableCores() to determine the correct number of cores available in single-node jobs.
library(parallel)                        # for mclapply
ncpus <- parallelly::availableCores()
options(mc.cores = ncpus)                # set a global option for parallel packages
res <- mclapply(..., mc.cores = ncpus)   # or set the number of cores per call
Poor scaling of parallel code
Don't expect your code to automatically work well if you just scale up the number of CPU cores. Job efficiency often drops with an increasing number of CPU cores, and in some cases the job may even take longer when using too many CPU cores.
Beware submitting large R jobs
Before you start submitting large jobs to the cluster, measure the parallel efficiency first. Parallel efficiency (actual scaling vs. ideal scaling) should be well above 50%! Failure to do so may result in official warnings and, in extreme and repeated cases, in account suspension.
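A back-of-the-envelope way to estimate parallel efficiency from two test runs of the same workload (the timings below are purely hypothetical):
t_serial   <- 800                # wall-clock seconds on 1 core (hypothetical)
t_parallel <- 90                 # wall-clock seconds on 16 cores (hypothetical)
ncores     <- 16
speedup    <- t_serial / t_parallel
efficiency <- speedup / ncores   # about 0.56 here, i.e. 56%; aim for well above 0.5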
R is using more threads than CPU cores available
This is a common problem, as setting OMP_NUM_THREADS=1 is no silver bullet that catches them all: some R packages do not respect it and still launch 128 threads per process. Packages known for this behaviour:
| package | solution |
|---|---|
| ranger | add export R_RANGER_NUM_THREADS=1 to your Slurm script after loading the R module. Details. |
| randomForestSRC | add export MC_CORES=1 and export RF_CORES=1 to your Slurm script after loading the R module. Details. |
If you encounter a similar problem with other packages and you know the solution, let us know so that everyone can benefit from adding it to the list above.
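Alternatively, the same limits can be set from within R before the affected functions are called; a sketch mirroring the exports above (whether this takes effect depends on when the package reads the variable):
Sys.setenv(R_RANGER_NUM_THREADS = "1")       # ranger
Sys.setenv(MC_CORES = "1", RF_CORES = "1")   # randomForestSRC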
Installing packages in a Job does not work
This is to be expected when installing from sources that require internet access, which the compute nodes don't have. In general it is a bad idea to install packages from within a production computation. Instead, install packages either interactively or via an R script on the login node. Thanks to the shared GPFS file system, the package will be available on all compute nodes immediately.
MPI with R and Slurm
There are several packages that allow for MPI parallelization with R:
- Rmpi (package that implements the low level MPI interface)
- pbdMPI (S4 classes to directly interface MPI in order to support the Single Program/Multiple Data (SPMD) parallel programming style)
- doMPI (high-level package that enhances foreach with MPI; it uses all requested CPU cores except the control process, e.g. if you request 16 MPI processes, only 15 of them will do the heavy lifting)
There is one important caveat when using R with Slurm (mandatory): since Slurm takes care of spawning the MPI processes (the requested processes run throughout the lifetime of your Slurm job), you cannot spawn processes yourself or dynamically by calling any of the mpi.spawn.* functions.
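As an illustration, a minimal SPMD-style script using pbdMPI that only uses the ranks already launched by Slurm/srun (the file name hello_mpi.R is just an example; submit it with the MPI job template below):
# hello_mpi.R: every MPI rank executes this same script (SPMD style)
library(pbdMPI)
init()                                    # attach to the ranks started by srun
comm.cat("Hello from rank", comm.rank(), "of", comm.size(), "\n", all.rank = TRUE)
finalize()                                # shut down MPI cleanly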
Private installations
Almost always there is no need for private installations. If you still think you need one, continue reading.
From conda-forge
Most R packages available on CRAN are also available via conda-forge. In fact, we use conda-forge for centrally managed R installations as well.
module load micromamba
micromamba create -n myspecialRenv -c conda-forge r-base==4.3.3 ...
micromamba activate myspecialRenv
micromamba install -c conda-forge r-morepackages
Self compilation
Although not recommended or supported, you may simply compile R yourself by loading an appropriate compiler module. You will have to manage dependencies yourself though.
Slurm Job templates
#!/bin/bash
# Single-node job: 1 task with 16 CPU cores (e.g. for the parallel package)
#SBATCH --time=00:20:00
#SBATCH --partition=epyc
#SBATCH --tasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=2G

# Load the version of R you want to use
module purge
module load R

# Run your R script
srun Rscript test.R
#!/bin/bash
# MPI job: 16 tasks on each of 2 nodes (e.g. for Rmpi/pbdMPI)
#SBATCH --time=00:20:00
#SBATCH --partition=epyc
#SBATCH --tasks-per-node=16
#SBATCH --nodes=2
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2G

# Load the version of R you want to use
module purge
module load R

# Run your R script
srun Rscript test.R