CoolMUC-2: Open issues after the Cluster Hardware and Software Upgrade
After the maintenance of CoolMUC-2, the system is still in test operation. Many frequently used applications are working, but there are still open issues with respect to the stability of the system as well as the availability of the software stack. Main issues:
- Software: The provisioning of the LRZ software stack is still a work in progress. Changes might be applied at short notice.
Please check the following tables for details on particular software packages and system issues.
Software Packages from the Spack stack that do not work yet
Package | Module name | Priority | Comments |
---|---|---|---|
ANSYS Mechanical | ansys | | Versions 2020.R1, 2019.R3 and 2019.R2 work; all older versions of ANSYS Mechanical are not able to run due to library dependency errors. All ANSYS Mechanical tests were based on Intel MPI 2018 (provided through Spack). Working modules published, documentation updated. |
ANSYS Fluent | fluent | | Versions 2020.R1, 2019.R3, 2019.R2 and 2019.R1 work with Intel MPI 2018. Versions 19.2 and 19.1 work only with IBM MPI ("-mpi=ibmmpi -pib.dapl"; see documentation, the module comments, and the launch sketch after this table). Versions 19.1 and 19.2 are still provided, but are no longer recommended and will be retired soon. |
ANSYS CFX | cfx | | Versions 2020.R1, 2019.R3, 2019.R2 and 2019.R1 work with Intel MPI 2018. Versions 19.2 and 19.1 also work with Intel MPI 2018, although the software originally used Intel MPI 2017; both will be retired soon. Working modules of ANSYS CFX published on the module system; ANSYS CFX documentation updated. |
ANSYS ICEM/CFD | icem | | Tested, OK. |
ANSYS EM | ansysedt | | Concerns HFSS and Maxwell 2D/3D: SLES 15 is not a supported OS, so these are almost guaranteed not to work on CoolMUC-2. HFSS and Maxwell Version 2019.R3 work under SLURM on SLES 12 on CoolMUC-3. |
ANSYS Ensight | ensight | | Tested, OK. |
ANSYS Workbench | wb | | Please use on RVS only (which still runs SLES 12). |
COMSOL | comsol | | Tested Version 5.5 (best chance of getting it running; SLES 15 is supported, see https://www.comsol.de/system-requirements). Internal MPI support: Intel(R) MPI Library, Version 2018 Update 2 Build 20180125 (id: 18157). External MPI support via the -mpiroot flag. |
OpenFOAM | openfoam | | Spack-provided modules are installed in Spack 20.1.1. |
Paraview | paraview | | A new Spack module is installed in Spack 20.1.1. |
CP2k | cp2k | 2 | The ELPA version in the build is not supported; as an intermediate remedy, use `PREFERRED_DIAG_LIBRARY SL` in the `&GLOBAL` section of your input (see the input sketch after this table). |
Molden | molden | | Fixed with `spack/staging/20.1.1` (see the module example after this table). |
Cube Analysis Tool | cube | | Fixed with `spack/staging/20.1.1`. |
MSC Nastran | mscnastran | | Versions 20182, 20190, 20191 and 20200 tested. |
Scalable Molecular Dynamics | namd | 2 | |
NetCDF | netcdf | | A new Spack module is installed in Spack 20.1.1. |
Quantum Espresso | quantum-espresso | | Fixed with `spack/staging/20.1.1`. |
Scalasca Analysis Toolkit | scalasca | | Fixed with `spack/staging/20.1.1`. |
Siemens PLM StarCCM+ | starccm starccm_dp | | Versions 2020.1.1, 2020.1 and 2019.3.1: fixed for Intel MPI 2018 ("-mpi intel"); also work with "-mpi openmpi -fabric ofi". Version 2019.2.1: tested and working with Intel MPI 2018 ("-mpi intel"); also works with "-mpi openmpi -fabric ibv". Working modules of StarCCM+ published on the module system; StarCCM+ documentation updated (see the launch sketch after this table). |
Intel MPI legacy versions | intel-mpi/2018 | | Spack module provided and successfully tested. As long as cgroups are deactivated (see the SLURM entry in the system services table below), out-of-memory errors at the end of Intel MPI 2018 jobs should no longer occur. |
Matlab | matlab/R2019a_Update5-generic, matlab/R2019b-generic | | |
GNU Compiler 9.3.0 | gcc/9.3.0(-nv) | | g++ compiler requires |
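
For the Fluent 19.1/19.2 case above, the IBM MPI options translate into a launch line roughly like the following sketch. Only "-mpi=ibmmpi -pib.dapl" comes from the table; the module name, journal file and solver variant are placeholder assumptions:

```bash
# Inside a SLURM batch job: run Fluent 19.x with the IBM MPI options quoted
# in the table above. 3ddp, -g, -t and -i are standard Fluent batch options;
# the module name and journal file are placeholders.
module load fluent                  # placeholder: pick the concrete 19.1/19.2 module
fluent 3ddp -g -t"${SLURM_NTASKS}" -mpi=ibmmpi -pib.dapl -i run.jou
```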
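The CP2k remedy above goes into the input file itself. A minimal sketch of the relevant section, assuming standard CP2K input syntax (the project and file names are placeholders):

```bash
# Minimal sketch: write a &GLOBAL section that prefers ScaLAPACK ("SL")
# over the unsupported ELPA build, per the intermediate remedy above.
cat > global_section.inc <<'EOF'
&GLOBAL
  PROJECT myrun                 ! placeholder project name
  PREFERRED_DIAG_LIBRARY SL    ! use ScaLAPACK instead of ELPA
&END GLOBAL
EOF
```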
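Several entries above are marked as fixed in `spack/staging/20.1.1`. A hedged sketch of switching to that stack, assuming the default stack is provided through a module named `spack`:

```bash
# Swap the default Spack stack for the staging release containing the fixes,
# then load one of the repaired packages (Molden as an example).
module switch spack spack/staging/20.1.1   # assumption: default module is named "spack"
module load molden
module list                                # verify the loaded environment
```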
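For StarCCM+, the MPI selection above is passed on the command line. A minimal sketch; the "-mpi openmpi -fabric ofi" options are quoted from the table, while the module name, macro and simulation file are placeholders:

```bash
# Inside a SLURM batch job: run StarCCM+ with the Open MPI/OFI fabric options
# quoted in the table above. -batch and -np are standard StarCCM+ options.
module load starccm                 # placeholder: pick the concrete version module
starccm+ -batch run.java -np "${SLURM_NTASKS}" -mpi openmpi -fabric ofi case.sim
```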
System Services that do not work or have operational restrictions
Item | Status | Comments |
---|---|---|
dssusrinfo command | fully available since 13:00 | |
SLURM control of resource usage | workaround applied | cgroups have been deactivated for now, since they appear to have too many side effects (spurious out-of-memory kills). |
Filesystem issues | correction applied to InfiniBand setup | We currently believe that job failures caused by an inability to load module dependencies were due to non-availability of the GPFS filesystem. |
salloc does not work | available | Please use lxlogin1, ..., lxlogin4 to submit interactive jobs on cm2_inter! Please use lxlogin8 to submit interactive jobs on mpp3_inter! (See the srun sketch after this table.) |
sporadic job crashes due to node failure | resolved | The node failure problems should now be resolved. However, you now need to submit jobs into the new, separate SLURM cluster cm2_tiny: `#SBATCH --cluster=cm2_tiny` and `#SBATCH --partition=cm2_tiny`. General advice: although jobs on cluster cm2_tiny work without setting the partition name, we highly recommend defining both the cluster name and the partition name in all job scripts (see the job script sketch after this table). |
ANSYS Mechanical with User Fortran fails | resolved in spack/staging/20.1.1 | Issue fixed and spack/staging/20.1.1 rolled out. |
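
As a sketch of the interactive workaround above, an interactive shell can be requested with srun from one of the listed login nodes. The cluster name `inter` and the resource values are assumptions, not confirmed settings:

```bash
# Run from lxlogin1..lxlogin4 to get an interactive shell on cm2_inter.
# The --clusters value "inter" is an assumption; adjust to the real SLURM setup.
srun --clusters=inter --partition=cm2_inter \
     --nodes=1 --time=00:30:00 --pty bash -i
```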
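Following the general advice above, a minimal cm2_tiny job script might look like this sketch; the `slurm_setup` module and the 28-tasks-per-node value are assumptions based on common LRZ conventions, and the executable is a placeholder:

```bash
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --cluster=cm2_tiny      # set the cluster explicitly, as recommended above
#SBATCH --partition=cm2_tiny    # set the partition explicitly, as recommended above
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=28    # assumption: 28 physical cores per CoolMUC-2 node
#SBATCH --time=01:00:00

module load slurm_setup         # assumption: LRZ helper module for the job environment

srun ./my_application           # placeholder executable
```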