AI Systems

AI Systems / BayernKI

 


GENERAL NOTICE

  • This system is currently in pilot operation.
  • The LRZ AI cluster will host the EuroCC AI Hackathon.
    • The event will take place from 7 to 23 October 2025.
    • During this period, some LRZ resources will be reserved for this event's participants. 
  • Notice: High Storage Utilization
    • Our current storage systems are under high demand across all users.
    • Please use your allocated storage space efficiently and remove unnecessary data where possible.
    • Additional storage capacity is planned and is expected to be available by mid-2026.

Shutdown of HPC systems due to power-supply issues!
Thu, 16.10.2025 10:55 – expected until Fri, 17.10.2025 17:15
Affected services: [Linux Cluster], [Hoechstleistungsrechner], [AI Systems]

17.10.2025 - 17:15: The power-supply issues affecting the HPC and AI systems have been resolved. All systems are back in operation. Individual nodes might still need extra attention. Please report any persistent issues; we will address them next week.

---

17.10.2025 - 16:45: Linux Cluster is back in operation.

---

17.10.2025 - 16:10: The AI Systems are back in operation.

---

17.10.2025 - 12:00: SuperMUC-NG Phase 2 is back in operation.

---

16.10.2025 - 19:30: Operation of SuperMUC-NG Phase 1 has restarted. The system is up and queues are running. Phase 2 will follow tomorrow.

---

16.10.2025 - 10:45: All HPC systems have been shut down due to power-supply issues! We are working to restore system operation as soon as possible.



The LRZ AI Systems are designed to support the scientific community in the fields of Big Data and Artificial Intelligence, with a particular emphasis on GPU-intensive workloads. 

Access to the AI Systems requires a valid LRZ user account (see 1. Access).

The compute resources are funded through the BayernKI initiative (see 2. Compute).

For storage, the full range of LRZ Data Science Storage (DSS) services is available (see 3. Storage).

The primary use case is the deployment of containerized environments using the NVIDIA Enroot framework (see 4. Enroot).

Workload scheduling and resource management are provided by the Slurm resource manager (see 5. Slurm).


Documentation


Acknowledgement

Please use the following template to acknowledge the resources and support provided by the LRZ:

The authors gratefully acknowledge the scientific support and resources of the AI service infrastructure LRZ AI Systems provided by the Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences and Humanities (BAdW), funded by Bayerisches Staatsministerium für Wissenschaft und Kunst (StMWK).