4. Introduction to Enroot: The Software Stack Provider for the LRZ AI Systems

The LRZ AI Systems run jobs within containers, thereby allowing users to define their required custom software stacks. In particular, the Enroot container framework provided by NVIDIA is used. The Enroot container runtime operates completely in userspace (i.e. it does not require root privileges).
Please note:

  1. Enroot is currently NOT available on the SSH-login nodes of the LRZ AI Systems, but only on the compute nodes. Launch an interactive session and work on a compute node to make direct use of this tool.
  2. Tools like module, conda, and pip are not officially supported on the LRZ AI Systems. The only supported way to define your own software stacks (environments) for running your code is by using Enroot containers.

Enroot

The LRZ AI Systems use the Enroot container framework (https://github.com/NVIDIA/enroot/). Enroot allows you to run containers defined by different container images; in particular, Docker container images are supported. So, in principle, all you need is a Docker container image obtained from Docker Hub or from the NVIDIA NGC Cloud (NGC images are in fact also Docker container images). See 5. Using NVIDIA NGC Containers on the LRZ AI Systems and the documentation at https://github.com/NVIDIA/enroot/blob/master/doc/cmd/import.md if you want to make use of the latter option. Notice that pulling and using NVIDIA NGC containers requires registration (https://ngc.nvidia.com/signin).

Generally, running containerised applications with the Enroot container framework uses a slightly different workflow than the one you might already know from other container technologies. For most use cases, you need to know at most three commands: enroot import, enroot create, and enroot start (although it is recommended to have a look at the Enroot documentation).

The Enroot Workflow

As the Enroot container runtime is currently NOT available on the SSH-login nodes of the LRZ AI Systems, please launch an interactive session prior to following any of the steps below.
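
For reference, an interactive session on a compute node can typically be obtained via Slurm, for example along the following lines (a minimal sketch; <partition> is a placeholder that must be replaced by an appropriate partition of the LRZ AI Systems, and the requested resources are only an example):

$ salloc -p <partition> --gres=gpu:1   # allocate resources on a compute node
$ srun --pty bash                      # open an interactive shell on the allocated node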

Creating an Enroot Container Image with import

If there is a container image in a container registry (e.g., Docker Hub or NVIDIA NGC) that you want to use for running your job, the first step is to import that image as an Enroot image on your machine. The enroot import command is used for this.

$ enroot import docker://ubuntu

This command imports an existing container image from a container registry (e.g., Docker Hub or NVIDIA NGC) and creates an image that can be read by Enroot (i.e. an Enroot image). The newly created Enroot image has the same name as the imported image but with the .sqsh extension. This image can then be used to create Enroot containers.

This step only needs to be performed once, as many containers can be created out of that image.
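
If you want to pin a specific image tag or choose the name of the resulting Enroot image yourself, enroot import accepts an output file name via its -o/--output option. A minimal sketch (the tag and output file name below are only placeholders):

$ enroot import -o ubuntu-22.04.sqsh docker://ubuntu:22.04   # import the 22.04 tag and store it as ubuntu-22.04.sqsh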

Creating an Enroot Container with create

Once you have an Enroot image, you generally want to create an Enroot container for running your application within it. Use the enroot create command to expand the Enroot image into a proper file system that is stored locally. You will have one such file system for every Enroot container you create; if you need two containers out of a single image, you need to run this command twice (see the official Enroot documentation https://github.com/NVIDIA/enroot/blob/master/doc/usage.md for how to assign different names to different Enroot containers). The command synopsis is as follows:

$ enroot create --name <name-to-give-the-container> <path-to-an-Enroot-image>

If the --name option is not provided, the container takes the same name as the image (without the .sqsh extension).

A specific example of using this command is:

$ enroot create ubuntu.sqsh

In this example, a container named ubuntu will be created out of the image ubuntu.sqsh, which must exist in the working directory.
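
To create a second container out of the same image under a different name, pass the --name option; enroot list shows the containers created so far. A minimal sketch (the name my-ubuntu is only a placeholder):

$ enroot create --name my-ubuntu ubuntu.sqsh   # a second container from the same image
$ enroot list                                  # list the existing Enroot containers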

Running Software Inside an Existing Enroot Container with start

Once you have an Enroot container, you can run an application within the boundaries of that container (i.e. with the software stack defined by that container) using the enroot start command. It allows you either to run a hook script within the container (please consult the documentation of the chosen container image to find the path to this script) or to indicate the executable to be run as an argument on the command line.

$ enroot start ubuntu

This runs an application within an Enroot container. By default, the application is defined by a hook script within the container, but it can also be passed as an argument.
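
For example, instead of the default hook script you can run a specific executable by appending it (and its arguments) to the command line. A minimal sketch, assuming the commands shown exist inside the container:

$ enroot start ubuntu cat /etc/os-release   # run a single command inside the container
$ enroot start ubuntu bash                  # open an interactive shell inside the container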

If you need to run something as root inside the container, you can use the --root option. Remember: you are root only inside the container, not on the machine where the container is running.

$ enroot start --root ubuntu
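
A typical use of --root is installing additional software inside the container. The container root file system is mounted read-only by default, so such changes additionally require the --rw option. A minimal sketch, assuming a Debian/Ubuntu-based image where apt-get is available (the package is only a placeholder):

$ enroot start --root --rw ubuntu bash -c 'apt-get update && apt-get install -y git'   # install a package as root inside the writable container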

Importing NVIDIA NGC Containers

The catalogue of available NVIDIA NGC containers can be consulted here: https://ngc.nvidia.com/catalog/containers. To import (pull, in Docker terminology) these containers, you need an API key, which is associated with your NVIDIA NGC account. You can generate your API key here: https://ngc.nvidia.com/setup/api-key. For the rest of this section, let us refer to your generated API key as <API_KEY>.

To configure Enroot to use your API key, create the file enroot/.credentials within your $HOME and append the following line to it:

machine nvcr.io login $oauthtoken password <API_KEY>

where <API_KEY> is the key generated as described above. 
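
A minimal sketch of creating this file from the command line, assuming the file location described above (the single quotes prevent the shell from expanding $oauthtoken; adjust the path if your Enroot configuration stores credentials elsewhere, e.g. under $HOME/.config/enroot):

$ mkdir -p $HOME/enroot
$ echo 'machine nvcr.io login $oauthtoken password <API_KEY>' >> $HOME/enroot/.credentials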

After doing this, you can import containers from NVIDIA NGC. For example, a TensorFlow container (here, the 20.12-tf1-py3 tag) can be imported as indicated below.

$ enroot import docker://nvcr.io#nvidia/tensorflow:20.12-tf1-py3
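
The imported Enroot image can then be used like any other: create a container from it and start it. A minimal sketch, assuming the import above produced a file named nvidia+tensorflow+20.12-tf1-py3.sqsh in the working directory (Enroot derives the file name from the image URI) and that python is on the container's default path, as is typical for NGC images:

$ enroot create --name tensorflow nvidia+tensorflow+20.12-tf1-py3.sqsh   # create a container from the imported image
$ enroot start tensorflow python -c 'import tensorflow as tf; print(tf.__version__)'   # quick check that TensorFlow is available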