HDF5
HDF5 (Hierarchical Data Format Version 5) is a general purpose library and file format for storing scientific data. HDF5 can store two primary objects: datasets and groups. A dataset is essentially a multidimensional array of data elements, and a group is a structure for organizing objects in an HDF5 file. Using these two basic objects, one can create and store almost any kind of scientific data structure, such as images, arrays of vectors, and structured and unstructured grids. You can also mix and match them in HDF5 files according to your needs.
Installation and Use of HDF5 on LRZ platforms
Linux based HPC Systems
As of April 2022, the new software stack 22.2.1 is available on CoolMUC2 and SuperMUC-NG. We provide at least one minor version each of HDF5 1.8 and HDF5 1.10. Take care when choosing between them, as these versions differ in file format and API.
To list the available hdf5 modules, run
module avail hdf5
On spack stack 22.2.1 we provide the following modules:
Serial HDF5 | HDF5 MPI parallel (with Intel-MPI) |
---|---|
hdf5/1.8.22-gcc11 | hdf5/1.8.22-gcc11-impi |
hdf5/1.8.22-intel21 | hdf5/1.8.22-intel21-impi |
hdf5/1.10.7-gcc11 | hdf5/1.10.7-gcc11-impi |
hdf5/1.10.7-intel19 | hdf5/1.10.7-intel21-impi |
The suffixes "-gcc11" and "-intel21" indicate the compiler used for the build; load the corresponding compiler module when using one of these modules. The suffix "-impi" marks the MPI parallel version, built with the standard Intel-MPI module.
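The naming scheme can be taken apart mechanically. The following is only a sketch: parse_hdf5_module is a hypothetical helper, not an LRZ tool, and it assumes module names of the form hdf5/VERSION-COMPILER[-impi] as in the table above.

```shell
# Sketch: split an hdf5 module name into version, compiler tag and flavour.
# parse_hdf5_module is a hypothetical helper, not part of the LRZ software stack.
parse_hdf5_module() {
    name="${1#hdf5/}"        # drop the "hdf5/" prefix, e.g. 1.10.7-intel21-impi
    version="${name%%-*}"    # text before the first "-", e.g. 1.10.7
    rest="${name#*-}"        # text after the first "-",  e.g. intel21-impi
    compiler="${rest%%-*}"   # compiler tag,              e.g. intel21
    case "$rest" in
        *-impi) flavour="parallel" ;;   # built with Intel-MPI
        *)      flavour="serial" ;;
    esac
    echo "$version $compiler $flavour"
}

parse_hdf5_module hdf5/1.10.7-intel21-impi
parse_hdf5_module hdf5/1.8.22-gcc11
```

The compiler tag printed by the helper tells you which compiler module to load alongside the hdf5 module.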
All packages are built with C, C++ and Fortran support. To make use of HDF5, please load the appropriate environment module.
For the parallel version with the Intel compiler, use e.g.
module load hdf5/1.10.7-intel21-impi
Then, compile your code with
[mpicc|mpicxx|mpif90] -c $HDF5_INC foo.[c|cc|f90]
and link it with
[mpicc|mpicxx|mpif90] -o myprog foo.o <further objects> [$HDF5_F90_SHLIB|$HDF5_CPP_SHLIB] $HDF5_SHLIB
For a serial version (with the Intel compiler), use e.g.
module load hdf5/1.10.7-intel21
Then, compile your code with
[icc|icpc|ifort] -c $HDF5_INC foo.[c|cc|f90]
and link it with
[icc|icpc|ifort] -o myprog.exe foo.o <further objects> [$HDF5_F90_SHLIB|$HDF5_CPP_SHLIB] $HDF5_SHLIB
The language support libraries $HDF5_F90_SHLIB and $HDF5_CPP_SHLIB are only required if your application is compiled and linked with Fortran or C++, respectively.
For static linking, use the $HDF5_..._LIB versions instead of $HDF5_..._SHLIB, but this is not recommended.
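Which library variables go on the link line depends on the source language. As a rough sketch, the hypothetical helper hdf5_link_vars below (an assumption for illustration, not an LRZ command) maps the file suffixes used in the examples above to the variables named there. It prints the variable names literally, so it also runs without an hdf5 module loaded.

```shell
# Sketch: choose the link variables for a source file by its suffix.
# hdf5_link_vars is a hypothetical helper; it prints variable names only.
hdf5_link_vars() {
    case "$1" in
        *.c)   echo '$HDF5_SHLIB' ;;                   # C needs the base library only
        *.cc)  echo '$HDF5_CPP_SHLIB $HDF5_SHLIB' ;;   # C++ adds the C++ support library
        *.f90) echo '$HDF5_F90_SHLIB $HDF5_SHLIB' ;;   # Fortran adds the Fortran support library
        *)     echo "unsupported source file: $1" >&2; return 1 ;;
    esac
}

hdf5_link_vars foo.c
hdf5_link_vars foo.cc
hdf5_link_vars foo.f90
```

The language support library comes before $HDF5_SHLIB, matching the order in the link commands above.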
Utilities
Loading an HDF5 module typically also makes command-line utilities available, e.g. h5copy, h5debug and h5dump. It may be advisable to run these utilities using a serial (as opposed to MPI parallel) HDF5 version, since a linked-in MPI library may not work properly in purely interactive usage.
h5utils
h5utils (GitHub) is a set of utilities for the visualisation and conversion of scientific data in HDF5 format. Besides providing a simple tool for batch visualisation as PNG images, h5utils also includes programs to convert HDF5 datasets into the formats required by other free visualisation software (e.g. plain text, Vis5d, and VTK).
h5utils is not part of the HDF5 module, nor is it available directly in the LRZ provided software stack. The recommended procedure to install this software on SuperMUC-NG, CoolMUC-2 and other LRZ managed clusters is to install it via user-spack:
module load user_spack
# Install
spack info h5utils
spack install h5utils
# Load to search path
spack load h5utils
# Unload
spack unload h5utils
Documentation
Please refer to the HDF5 Web Site for documentation of the interface.
H5py (Pythonic Interface to HDF5)
There are several options to install h5py on LRZ systems. One option is to use "pip" or "Conda" (see https://doku.lrz.de/display/PUBLIC/Python+for+HPC for details). The other, probably preferable, option is to install it in your $HOME folder via "user_spack", the LRZ adaptation of the Spack package management tool. The installation procedure is similar on all systems.
To create a module for h5py you need to specify the hdf5 module you want to work with. Let us assume you want to use the module hdf5/1.10.7-gcc8-impi on CoolMUC2, which is built with GCC. The installation is analogous for all other hdf5 modules. To build h5py with this hdf5 version, we need the hash of the Spack installation, which can be obtained using "module show":
cm2login3:~> module show hdf5/1.10.7-gcc8-impi | grep BASE
setenv HDF5_BASE /dss/dsshome1/lrz/sys/spack/release/21.1.1/opt/haswell/hdf5/1.10.7-gcc-2iitq6x
The hash consists of the last seven characters: 2iitq6x. Please note: the hashes of the installations differ between systems, so using the hash from above for an installation on e.g. SuperMUC-NG will fail.
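Instead of copying the hash by hand, it can be cut out of the path with shell parameter expansion. This is a minimal sketch that assumes the hash is everything after the last "-" in the path, as in the example output above. The path is hard-coded here so that the snippet runs anywhere; on the cluster you would take the value reported by "module show" instead.

```shell
# Path copied from the example output above; it differs on other systems.
HDF5_BASE="/dss/dsshome1/lrz/sys/spack/release/21.1.1/opt/haswell/hdf5/1.10.7-gcc-2iitq6x"

# Assumption: the Spack hash is the text after the last "-" in the path.
hdf5_hash="${HDF5_BASE##*-}"

echo "$hdf5_hash"
```

For the example path this prints 2iitq6x.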
The installation (which you only need to do once, provided it succeeds) is then done via the following steps:
Installation
Prerequisites
First we need to create our own Spack repository in our home directory in $HOME/spack/repos and copy the modified package.py there.
module unload intel-mkl intel-mpi intel gcc # unload all compiler modules as they are not needed at this point
module switch spack/21.1.1 # unnecessary as soon as the module user_spack/release/22.2.1 is available
module load user_spack
Installation
Now we need the compiler used to build the hdf5 module and the hash of the installation (see above). The general installation command looks like this:
spack install py-h5py%COMPILER ^hdf5/HASH_OF_INSTALLATION
where COMPILER stands for the compiler of the hdf5 module, which can be gcc or intel (note: no version numbers needed here), and HASH_OF_INSTALLATION is the installation hash (see above). So for our example this would be
spack install py-h5py%gcc ^hdf5/2iitq6x
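The spec string can also be assembled from its two parts. The sketch below uses a hypothetical helper, h5py_spec, which only builds and prints the command and does not call spack itself. It rejects any compiler name other than gcc or intel, following the note above.

```shell
# Sketch: build the Spack spec for h5py from compiler name and hdf5 hash.
# h5py_spec is a hypothetical helper; it prints the spec and installs nothing.
h5py_spec() {
    compiler="$1"   # "gcc" or "intel", without a version number
    hash="$2"       # hash of the hdf5 installation
    case "$compiler" in
        gcc|intel) ;;
        *) echo "compiler must be gcc or intel, got: $compiler" >&2; return 1 ;;
    esac
    echo "py-h5py%${compiler} ^hdf5/${hash}"
}

spec="$(h5py_spec gcc 2iitq6x)"
echo "spack install $spec"
```

Printing the command first makes it easy to check the compiler and hash before starting a build.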
Module Creation
If the steps above were successful, we need to create the module for the h5py installation:
spack module tcl refresh -y
The module is then generated in the directory $HOME/spack/modules/x86_avx2/linux-sles15-haswell/.
Note: The subfolder x86_avx2 in the module path $HOME/spack/modules/x86_avx2/linux-sles15-haswell/ differs on other systems. On SuperMUC-NG, for example, the path would be $HOME/spack/modules/x86_avx512/linux-sles15-skylake_avx512/.
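If the same job script is used on both systems, the path can be selected in one place. This is a sketch only: h5py_module_dir and the labels "coolmuc2" and "supermuc-ng" are hypothetical names chosen for illustration, and only the two paths mentioned above are covered.

```shell
# Sketch: return the module directory for a given system.
# h5py_module_dir and the system labels are hypothetical, not LRZ conventions.
h5py_module_dir() {
    case "$1" in
        coolmuc2)    echo "$HOME/spack/modules/x86_avx2/linux-sles15-haswell/" ;;
        supermuc-ng) echo "$HOME/spack/modules/x86_avx512/linux-sles15-skylake_avx512/" ;;
        *)           echo "unknown system: $1" >&2; return 1 ;;
    esac
}

h5py_module_dir coolmuc2
h5py_module_dir supermuc-ng
```

On other systems, check the actual directory created by "spack module tcl refresh" rather than guessing it.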
Using the Module
To use the h5py module, you need to make it available to the module system and also load the corresponding hdf5 module. Put the following four lines in your SLURM script:
module use -p $HOME/spack/modules/x86_avx2/linux-sles15-haswell/
module load python/3.8.8-extended
module load hdf5/1.10.7-gcc8-impi
module load py-h5py
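After loading the modules, a quick import test shows whether Python can actually find h5py. The following is a minimal sketch; it assumes that the interpreter provided by the python module is callable as python3, which may differ on your system.

```shell
# Sketch: check that h5py can be imported by the current Python interpreter.
# Assumption: the interpreter is available as "python3".
if python3 -c "import h5py" 2>/dev/null; then
    h5py_status="available"
    # h5py.version.info summarises the h5py and HDF5 versions in use
    python3 -c "import h5py; print(h5py.version.info)"
else
    h5py_status="missing"
    echo "h5py cannot be imported; check that all four modules are loaded" >&2
fi

echo "h5py is $h5py_status"
```

The printed summary includes the HDF5 version h5py was built against, which should match the hdf5 module you loaded.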
If you encounter any problems, please contact our Servicedesk.