2023-10-05 Data Parallelism - How to Train Deep Learning Models on Multiple GPUs (hdli1w23)

Course: Data Parallelism - How to Train Deep Learning Models on Multiple GPUs
Number: hdli1w23
Available places: 5
Date: 05.10.2023 – 05.10.2023
Price: EUR 0.00
Location: Leibniz Rechenzentrum, Boltzmannstr. 1, 85748 Garching b. München
Room: Kursraum 2
Registration deadline: 21.09.2023 23:59
E-mail: education@lrz.de


This is an on-site course at LRZ in Garching near Munich. Participants are expected to bring their own laptops running the latest version of Chrome or Firefox. There are no PCs installed in the course room!

Contents

Modern deep learning challenges leverage increasingly large datasets and more complex models. As a result, significant computational power is required to train models effectively and efficiently. Learning to distribute data across multiple GPUs during model training opens up a wide range of new deep learning applications.

Additionally, the effective use of systems with multiple GPUs reduces training time, allowing for faster application development and much faster iteration cycles. Teams who are able to perform training using multiple GPUs will have an edge, building models trained on more data in shorter periods of time and with greater engineer productivity.

This workshop teaches you techniques for data-parallel deep learning training on multiple GPUs to shorten the training time required for data-intensive applications. Working with deep learning tools, frameworks, and workflows to perform neural network training, you’ll learn how to decrease model training time by distributing data to multiple GPUs, while retaining the accuracy of training on a single GPU.

The course is co-organised by LRZ and the NVIDIA Deep Learning Institute (DLI). All instructors are NVIDIA certified University Ambassadors.

Learning Objectives

By participating in this workshop, you’ll:

  • Understand how data parallel deep learning training is performed using multiple GPUs
  • Achieve maximum throughput when training, for the best use of multiple GPUs
  • Distribute training to multiple GPUs using PyTorch Distributed Data Parallel
  • Understand and utilize algorithmic considerations specific to multi-GPU training performance and accuracy
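
To make the first objective concrete, here is a minimal sketch of the idea behind data-parallel training. It is not course material: it uses plain Python instead of PyTorch, simulates the workers in a single process, and all function names (`mse_gradient`, `split_into_shards`, `data_parallel_gradient`) are hypothetical. It assumes a one-parameter linear model with a mean squared error loss and equally sized shards. Under those assumptions, averaging the per-shard gradients gives the same value as computing the gradient over the whole batch, which is why the result can match single-GPU training.

```python
# Toy illustration of data-parallel training (pure Python, no GPUs involved).
# Assumed model: y_hat = w * x, assumed loss: mean squared error.

def mse_gradient(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def split_into_shards(items, num_workers):
    """Split a list into num_workers contiguous, equally sized shards."""
    if len(items) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    size = len(items) // num_workers
    return [items[i * size:(i + 1) * size] for i in range(num_workers)]


def data_parallel_gradient(w, xs, ys, num_workers):
    """Each simulated worker computes a local gradient; the results are averaged."""
    x_shards = split_into_shards(xs, num_workers)
    y_shards = split_into_shards(ys, num_workers)
    local_grads = [
        mse_gradient(w, xs_k, ys_k) for xs_k, ys_k in zip(x_shards, y_shards)
    ]
    # On real hardware this averaging step is a collective all-reduce.
    return sum(local_grads) / num_workers


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]  # synthetic targets generated with w = 2
w = 0.5

full_grad = mse_gradient(w, xs, ys)
parallel_grad = data_parallel_gradient(w, xs, ys, num_workers=4)
print(full_grad, parallel_grad)
```

The equality holds only because every shard has the same number of samples. With unequal shards, a plain average of local gradients would weight samples differently from the full-batch gradient.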

Important information

After you are accepted, please create an account at courses.nvidia.com/join.

Ensure your laptop / PC will run smoothly by going to http://websocketstest.com/. Make sure that WebSockets work for you: under Environment, WebSockets should be shown as supported, and under WebSockets (Port 80), Data Receive, Send and Echo Test should all show Yes. If there are issues with WebSockets, try updating your browser.

NVIDIA Deep Learning Institute

The NVIDIA Deep Learning Institute delivers hands-on training for developers, data scientists, and engineers. The program is designed to help you get started with training, optimising, and deploying neural networks to solve real-world problems across diverse industries such as self-driving cars, healthcare, online services, and robotics.

Prerequisites

Experience with deep learning training using Python.

Hands-On

The lectures are interleaved with many hands-on sessions using Jupyter Notebooks. The exercises will be done on a fully configured GPU-accelerated workstation in the cloud.

Language

English

Lecturers

PD Dr. Juan Durillo Barrionuevo (LRZ, NVIDIA certified University Ambassador)

Prices and Eligibility

The course is open and free of charge for people from academia from the Member States (MS) of the European Union (EU) and Associated Countries to the Horizon 2020 programme.

Registration

Please register with your official e-mail address to prove your affiliation.

Withdrawal Policy

See Withdrawal

Legal Notices

For registration for LRZ courses and workshops we use the service edoobox from Etzensperger Informatik AG (www.edoobox.com). Etzensperger Informatik AG acts as processor and we have concluded a Data Processing Agreement with them.

See Legal Notices


No.: 1
Date: 05.10.2023
Time: 10:00 – 17:00
Leader: Juan Durillo Barrionuevo, LRZ Events
Location: Leibniz Rechenzentrum
Room: Kursraum 2
Description: Lecture