General
R is a free software environment for statistical computing and graphics. It is a modular code interpreter and can be extended via packages from the Comprehensive R Archive Network (CRAN).
Recommended reading for all R users: the CRAN Task View "High-Performance and Parallel Computing with R" gives an overview of R's HPC-related packages grouped by topic.
Modules providing R
Modules providing R can be found via
module spider R
The "*-cf" versions are actually a conda environments but without any conda functionality exposed to the user. The packages are from the popular conda-forge
repository and it is optimized for haswell
( –mtune=haswell
) processors, which is also suitable for Zen3 AMD processors. In addition all BLAS and LAPACK calls are handled by optimized Intel MKL libraries ( v2023.2.1
).
For details regarding compilation flags, see $R_HOME/etc/Makeconf after loading the module.
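Once a module is loaded, you can also inspect the toolchain from within R itself; a small sketch using standard base-R utilities:
flags <- readLines(file.path(R.home("etc"), "Makeconf"))   # $R_HOME/etc/Makeconf
writeLines(grep("^(CC|CFLAGS|FFLAGS) *=", flags, value = TRUE))  # compiler flags recorded at build time
sessionInfo()   # also lists the BLAS/LAPACK libraries in use (should point to MKL)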
module load R/4.3.3
This module also does a bit more work under the hood. It sets the following environment variables:
- R_LIBS_USER=/hpc/gpfs2/scratch/u/$USER/.R/4.3, so that users may install additional CRAN packages themselves or override existing packages with different versions. The directory is created automatically by the module.
- TMPDIR=/hpc/gpfs2/scratch/u/$USER/.Rtmp/4.3, so that users' temporary files don't accumulate in or fill up the global /tmp folder on the login nodes. The directory is created automatically by the module.
- OMP_NUM_THREADS=1, so that implicit multithreading via OpenMP is disabled by default. If this variable were unset, each R process might use as many threads as are available on a node (typically 128), even if fewer cores were requested/allocated. If R is additionally parallelized via process forking (the parallel package), the nodes could easily become overloaded.
- various other environment variables needed for dependencies.
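You can check these settings from within an R session; a quick sketch using base R:
Sys.getenv(c("R_LIBS_USER", "TMPDIR", "OMP_NUM_THREADS"))   # variables set by the module
.libPaths()   # library search path: the user library in scratch should be listed first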
Installed packages
Installing additional packages
Ticket via HPC-Service
This is the preferred method if the package in question has been published on CRAN, since it will then be available to everyone using the provided R modules. For more information on how to open a ticket, see Service and Support.
Self service
The R modules set R_LIBS_USER to a directory in your scratch area, e.g. /hpc/gpfs2/scratch/u/$USER/.R/4.3, where packages will be installed. This typically works well unless your package requires additional system libraries as dependencies. In that case the installation will fail; please open a ticket instead!
CRAN package names are case sensitive!
install.packages("BlA")
BiocManager::install("BlA")
Packages installed by users (located in R_LIBS_USER) will always be preferred. If you require a newer (or different) version of a package that is already installed centrally, you can simply install it yourself.
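To confirm which copy of a package is actually picked up, a small sketch (the package name somepkg is only a placeholder):
packageVersion("somepkg")   # version that library() would load
find.package("somepkg")     # path of that copy (should be in your scratch area)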
Performance considerations and common pitfalls
Implicit multithreading
R is able to use implicit multithreading in a subset of optimized functions, e.g. functions that take advantage of parallelized BLAS/LAPACK routines. This is controlled by the environment variable OMP_NUM_THREADS, which is set to 1 by default. If you know your code benefits from this, you can increase it manually, either within R or by changing the environment variable explicitly (and/or reducing the number of worker or MPI processes at the same time). Running benchmarks is mandatory when using this, and don't expect miracles.
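A minimal sketch of raising the thread count from within an R session (the value 4 is arbitrary, and the RhpcBLASctl package shown as the second option is an assumption and may need to be installed first):
# Option 1: set the variable early in the session, before the first threaded BLAS/OpenMP call
Sys.setenv(OMP_NUM_THREADS = "4")
# Option 2 (assumption: the RhpcBLASctl package is available):
RhpcBLASctl::blas_set_num_threads(4)
RhpcBLASctl::omp_set_num_threads(4)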
Using parallel::detectCores()
This will detect the number of CPU cores of the entire compute node (typically 128), irrespective of the number of cores you requested via Slurm. When Slurm has allocated less than a full node for your job, this leads to node overloading (spawning more threads/processes than CPU cores allocated) and consequently to inefficient jobs. Therefore, do not use parallel::detectCores() at all on HPC clusters; always use either parallelly::availableCores() or future::availableCores() to determine the correct number of cores available in single-node jobs.
library(parallel)                        # for mclapply
ncpus <- parallelly::availableCores()
options(mc.cores = ncpus)                # set a global option for parallel packages
res <- mclapply(..., mc.cores = ncpus)   # or set the number of cores per call
Poor scaling of parallel code
Don't expect your code to automatically work well if you just scale up the number of CPU cores. Job efficiency often drops with an increasing number of CPU cores, and in some cases the job may even take longer when using too many CPU cores.
Beware submitting large R jobs
Before you start submitting large jobs to the cluster, measure the parallel efficiency first. Parallel efficiency (actual scaling vs. ideal scaling) should be well above 50%! Failure to do so may result in official warnings and, in extreme and repeated cases, in account suspension.
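A back-of-the-envelope way to estimate parallel efficiency from two test runs of the same workload (the timings below are purely hypothetical):
t_serial   <- 800                # wall-clock seconds on 1 core (hypothetical)
t_parallel <- 90                 # wall-clock seconds on 16 cores (hypothetical)
ncores     <- 16
speedup    <- t_serial / t_parallel
efficiency <- speedup / ncores   # about 0.56 here, i.e. 56%; aim for well above 0.5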
R is using more threads than CPU cores available
This is a common problem, as setting OMP_NUM_THREADS=1 is no silver bullet that catches them all: some R packages do not respect it and still launch 128 threads per process. Packages known for this behaviour:
| package | solution |
|---|---|
| ranger | add export R_RANGER_NUM_THREADS=1 to your Slurm script after loading the R module. Details. |
| randomForestSRC | add export MC_CORES=1 and export RF_CORES=1 to your Slurm script after loading the R module. Details. |
If you encounter a similar problem with other packages and you know the solution, let us know so that everyone can benefit from adding it to the list above.
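Alternatively, the same limits can be set from within R before the affected functions are called; a sketch mirroring the exports above (whether this takes effect depends on when the package reads the variable):
Sys.setenv(R_RANGER_NUM_THREADS = "1")       # ranger
Sys.setenv(MC_CORES = "1", RF_CORES = "1")   # randomForestSRC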
Installing packages in a Job does not work
This is to be expected when installing from sources that require internet access, which the compute nodes don't have. In general it is a bad idea to install packages from within a production computation. Instead, install packages either interactively or via an R script on the login node. Thanks to the shared GPFS file system, the package will be available on all compute nodes immediately.
MPI with R and Slurm
There are several packages that allow for MPI parallelization with R:
- Rmpi (package that implements the low level MPI interface)
- pbdMPI (S4 classes to directly interface MPI in order to support the Single Program/Multiple Data (SPMD) parallel programming style)
- doMPI (high-level package that enhances foreach with MPI; it uses all requested CPU cores except the control process, e.g. if you request 16 MPI processes, only 15 of them will do the heavy lifting)
There is one important caveat when using R with Slurm (mandatory): since Slurm takes care of spawning the MPI processes (the requested processes run throughout the lifetime of your Slurm job), you cannot spawn processes yourself or dynamically by calling any of the mpi.spawn.* functions.
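As an illustration, a minimal SPMD-style script using pbdMPI that only uses the ranks already launched by Slurm/srun (the file name hello_mpi.R is just an example; submit it with the MPI job template below):
# hello_mpi.R: every MPI rank executes this same script (SPMD style)
library(pbdMPI)
init()                                    # attach to the ranks started by srun
comm.cat("Hello from rank", comm.rank(), "of", comm.size(), "\n", all.rank = TRUE)
finalize()                                # shut down MPI cleanly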
Private installations
Almost always there is no need for private installations. If you still think you need one, continue reading.
From conda-forge
Most R packages available on CRAN are also available via conda-forge. In fact, we use conda-forge for centrally managed R installations as well.
module load micromamba
micromamba create -n myspecialRenv -c conda-forge r-base==4.3.3 ...
micromamba activate myspecialRenv
micromamba install -c conda-forge r-morepackages
Self compilation
Although not recommended or supported, you may simply compile R yourself by loading an appropriate compiler module. You will have to manage dependencies yourself though.
Slurm Job templates
#!/bin/bash
# Single-node job: 1 task with 16 CPU cores (e.g. for the parallel package)
#SBATCH --time=00:20:00
#SBATCH --partition=epyc
#SBATCH --tasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=2G

# Load the version of R you want to use
module purge
module load R

# Run your R script
srun Rscript test.R
#!/bin/bash
# MPI job: 16 tasks on each of 2 nodes (e.g. for Rmpi/pbdMPI)
#SBATCH --time=00:20:00
#SBATCH --partition=epyc
#SBATCH --tasks-per-node=16
#SBATCH --nodes=2
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2G

# Load the version of R you want to use
module purge
module load R

# Run your R script
srun Rscript test.R