Accelerating CUDA C++ Applications with Multiple GPUs
Description
Computationally intensive CUDA® C++ applications in high-performance computing, data science, bioinformatics, and deep learning can be accelerated by using multiple GPUs, which can increase throughput and/or decrease your total runtime. When combined with the concurrent overlap of computation and memory transfers, computation can be scaled across multiple GPUs without increasing the cost of memory transfers. For organisations with multi-GPU servers, whether in the cloud or on NVIDIA DGX™ systems, these techniques enable you to achieve peak performance from GPU-accelerated applications. It is also important to implement these single-node, multi-GPU techniques before scaling your applications across multiple nodes.
This course covers how to write CUDA C++ applications that efficiently and correctly utilise all available GPUs in a single node, dramatically improving the performance of your applications and making the most cost-effective use of systems with multiple GPUs.
The course is co-organised by LRZ and the NVIDIA Deep Learning Institute (DLI). NVIDIA DLI offers hands-on training for developers, data scientists, and researchers looking to solve challenging problems with deep learning.
Learning Objectives
By participating in this workshop, you’ll:
- Use concurrent CUDA streams to overlap memory transfers with GPU computation
- Utilise all available GPUs on a single node to scale workloads across them
- Combine the use of copy/compute overlap with multiple GPUs
- Rely on the NVIDIA Nsight™ Systems Visual Profiler timeline to observe improvement opportunities and the impact of the techniques covered in the workshop
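As a rough illustration of the first three objectives, the sketch below splits one array across every GPU found on the node and, on each GPU, across several non-default streams, so that the transfers of one chunk can overlap with the kernel of another. It is not taken from the course material: the kernel `scale_chunk`, the helper `chunk_range`, the problem size, and the number of streams per GPU are assumptions chosen for illustration only.

```cuda
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (line %d)\n",        \
                         cudaGetErrorString(err_), __LINE__);         \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical example kernel: doubles every element of one chunk.
__global__ void scale_chunk(float *data, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Half-open range [begin, end) of piece `idx` when `total` items are split
// into `parts` pieces of ceil(total / parts) items; trailing pieces may be
// shorter or empty.
struct Range { size_t begin, end; };

Range chunk_range(size_t total, size_t parts, size_t idx)
{
    size_t size  = (total + parts - 1) / parts;
    size_t begin = std::min(total, idx * size);
    size_t end   = std::min(total, begin + size);
    return {begin, end};
}

int main()
{
    const size_t N = 1 << 24;       // assumed problem size
    const int streams_per_gpu = 4;  // assumed; tune with the profiler

    int num_gpus = 0;
    CHECK(cudaGetDeviceCount(&num_gpus));

    // Pinned (page-locked) host memory is needed for truly asynchronous copies.
    float *host = nullptr;
    CHECK(cudaMallocHost(&host, N * sizeof(float)));
    for (size_t i = 0; i < N; ++i) host[i] = 1.0f;

    std::vector<float *> dev(num_gpus, nullptr);
    std::vector<cudaStream_t> streams(num_gpus * streams_per_gpu);

    for (int g = 0; g < num_gpus; ++g) {
        CHECK(cudaSetDevice(g));    // following calls target GPU g
        Range gr  = chunk_range(N, num_gpus, g);
        size_t gn = gr.end - gr.begin;
        CHECK(cudaMalloc(&dev[g], gn * sizeof(float)));

        for (int s = 0; s < streams_per_gpu; ++s) {
            cudaStream_t &st = streams[g * streams_per_gpu + s];
            CHECK(cudaStreamCreate(&st));
            Range sr = chunk_range(gn, streams_per_gpu, s);
            size_t n = sr.end - sr.begin;
            if (n == 0) continue;

            float *d = dev[g] + sr.begin;
            float *h = host + gr.begin + sr.begin;
            // Copy in, compute, copy out: ordered within this stream,
            // free to overlap with work queued in the other streams.
            CHECK(cudaMemcpyAsync(d, h, n * sizeof(float),
                                  cudaMemcpyHostToDevice, st));
            unsigned blocks = (unsigned)((n + 255) / 256);
            scale_chunk<<<blocks, 256, 0, st>>>(d, n);
            CHECK(cudaGetLastError());
            CHECK(cudaMemcpyAsync(h, d, n * sizeof(float),
                                  cudaMemcpyDeviceToHost, st));
        }
    }

    // Wait for every GPU to finish, then release its resources.
    for (int g = 0; g < num_gpus; ++g) {
        CHECK(cudaSetDevice(g));
        CHECK(cudaDeviceSynchronize());
        for (int s = 0; s < streams_per_gpu; ++s)
            CHECK(cudaStreamDestroy(streams[g * streams_per_gpu + s]));
        CHECK(cudaFree(dev[g]));
    }

    bool ok = true;
    for (size_t i = 0; i < N; ++i) ok = ok && (host[i] == 2.0f);
    std::printf("GPUs used: %d, result %s\n", num_gpus, ok ? "OK" : "MISMATCH");

    CHECK(cudaFreeHost(host));
    return ok ? 0 : 1;
}
```

The sketch can be built with `nvcc` (for example `nvcc -o overlap overlap.cu`, where the file name is arbitrary). Whether the copies and kernels actually overlap depends on the hardware and on the chunk sizes, which is what the Nsight Systems timeline is meant to reveal.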
Agenda
10:00 - 10:15 Welcome and Introduction
10:15 - 12:00 Lecture 1: Introduction & Main Objectives
12:00 - 13:00 Lunch break
13:00 - 14:45 Lecture 2: Copy/Compute Overlap
14:45 - 15:00 Coffee break
15:00 - 16:00 Lecture 3: Multiple GPUs
Material
Lecturer
Dr. Momme Allalen (LRZ, NVIDIA certified University Ambassador)
Survey
Please fill out the online survey at https://tinyurl.com/survey-hdli2w21
This helps us to:
- increase the quality of the courses,
- design the future training programme at LRZ according to your needs and wishes,
- get future funding for training events.
NVIDIA Deep Learning Institute
The NVIDIA Deep Learning Institute delivers hands-on training for developers, data scientists, and engineers. The program is designed to help you get started with training, optimizing, and deploying neural networks to solve real-world problems across diverse industries such as self-driving cars, healthcare, online services, and robotics.
TRAINING SETUP
To get started, follow these steps:
- Create an NVIDIA Developer account at http://courses.nvidia.com/join. Select "Log in with my NVIDIA Account" and then "Create Account".
- Make sure that WebSockets works for you:
  - Test your laptop at http://websocketstest.com
  - Under ENVIRONMENT, confirm that "WebSockets" is checked yes.
  - Under WEBSOCKETS (PORT 80), confirm that "Data Receive", "Send", and "Echo Test" are checked yes.
  - If there are issues with WebSockets, try updating your browser. We recommend Chrome, Firefox, or Safari for optimal performance.
- Visit http://courses.nvidia.com/dli-event and enter the event code provided by the instructor.
- You're ready to get started. Please complete the survey at the end of the course to share your feedback.
NEXT STEPS
Visit the NVIDIA Deep Learning Institute's website at http://www.nvidia.co.uk/dli to access more training and resources.
- Start online, self-paced training in deep learning and accelerated computing (using the account you created today).
- View upcoming workshops around the world and request an onsite workshop at your company or organization.
- Learn about the University Ambassador Program.
Ready to kick off a deep learning project or already working on one? Choose the best software and hardware solutions at http://www.nvidia.co.uk/deep-learning-ai/developer/