Distributed Deep Learning with Horovod Training Course

Horovod is an open source software framework designed for fast and efficient distributed deep learning training with TensorFlow, Keras, PyTorch, and Apache MXNet. It can scale a single-GPU training script to run on multiple GPUs or hosts with minimal code changes.

This instructor-led, live training (online or onsite) is aimed at developers and data scientists who wish to use Horovod to run distributed deep learning training jobs and scale them across multiple GPUs in parallel.

By the end of this training, participants will be able to:

Set up the necessary development environment to start running deep learning training jobs.
Install and configure Horovod to train models with TensorFlow, Keras, PyTorch, and Apache MXNet.
Scale deep learning training with Horovod to run on multiple GPUs.

Format of the Course

Interactive lecture and discussion.
Lots of exercises and practice.
Hands-on implementation in a live-lab environment.

Course Customization Options

This course is focused on Horovod, but other software tools and frameworks such as TensorFlow, Keras, PyTorch, and Apache MXNet may be required. Please let us know if you have specific requirements or preferences.
To request customized training for this course, please contact us.

This course is available as onsite live training in the United Kingdom or as online live training.

Course Outline

Introduction

Overview of Horovod features and concepts
Understanding the supported frameworks
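
As a rough illustration of the central concept, the sketch below simulates in plain Python what an averaging allreduce does to per-worker gradients: every worker ends up holding the same element-wise mean. The function name `allreduce_average` and the toy gradient values are illustrative assumptions, not part of Horovod's API. Horovod performs this step across real processes using communication backends such as MPI or NCCL.

```python
from typing import List


def allreduce_average(worker_grads: List[List[float]]) -> List[List[float]]:
    """Simulate an averaging allreduce (hypothetical helper, not a Horovod call).

    Each inner list is one worker's gradient vector. After the operation,
    every worker holds the same element-wise mean of all gradients.
    """
    if not worker_grads:
        raise ValueError("need at least one worker")
    length = len(worker_grads[0])
    if any(len(g) != length for g in worker_grads):
        raise ValueError("all workers must hold gradients of the same length")

    num_workers = len(worker_grads)
    # Reduce: sum the gradients across workers, element by element.
    summed = [sum(g[i] for g in worker_grads) for i in range(length)]
    mean = [s / num_workers for s in summed]
    # Broadcast: hand every worker its own copy of the averaged gradient.
    return [list(mean) for _ in range(num_workers)]


if __name__ == "__main__":
    # Toy gradients for four simulated workers.
    grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    for rank, g in enumerate(allreduce_average(grads)):
        print(f"worker {rank}: {g}")
```

Because every worker applies the same averaged gradient, the model replicas stay in sync without a central parameter server.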

Installing and Configuring Horovod

Preparing the hosting environment
Building Horovod for TensorFlow, Keras, PyTorch, and Apache MXNet
Running Horovod
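
To make the "Running Horovod" step more concrete, here is a small Python sketch that assembles a `horovodrun` launch command. The `-np` (total number of processes) and `-H` (comma-separated `host:slots` list) options are documented `horovodrun` flags. The helper name `build_horovodrun_command`, the script name `train.py`, and the host names are placeholders chosen for illustration.

```python
import shlex
from typing import Dict, List


def build_horovodrun_command(hosts: Dict[str, int], script: str) -> List[str]:
    """Build an argv list for horovodrun (hypothetical helper).

    `hosts` maps a host name to the number of process slots on that host,
    which is typically the number of GPUs it provides.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    if any(slots < 1 for slots in hosts.values()):
        raise ValueError("every host needs at least one slot")

    total_processes = sum(hosts.values())
    host_arg = ",".join(f"{name}:{slots}" for name, slots in hosts.items())
    return ["horovodrun", "-np", str(total_processes), "-H", host_arg, "python", script]


if __name__ == "__main__":
    # Placeholder setup: one machine with four GPUs.
    command = build_horovodrun_command({"localhost": 4}, "train.py")
    print(shlex.join(command))
```

The returned list can be passed to `subprocess.run` on a machine where Horovod is installed; the sketch itself only builds the command and does not launch anything.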

Running Distributed Training

Modifying and running training examples with TensorFlow
Modifying and running training examples with Keras
Modifying and running training examples with PyTorch
Modifying and running training examples with Apache MXNet
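
The modifications to a training script usually follow the same pattern in each framework: initialize Horovod, pin each process to one GPU by its local rank, give each worker its own slice of the data, scale the learning rate by the number of workers, wrap the optimizer so gradients are averaged, and broadcast the initial state from rank 0. The sketch below illustrates only the two steps that can be shown without a GPU or Horovod installed: data sharding and learning-rate scaling. The function names are hypothetical; in a real script, `rank` and `size` would come from `hvd.rank()` and `hvd.size()` after calling `hvd.init()`.

```python
from typing import List


def shard_indices(num_samples: int, rank: int, size: int) -> List[int]:
    """Return the sample indices assigned to one worker (hypothetical helper).

    Uses a strided partition: worker `rank` takes every `size`-th sample,
    so the shards are disjoint and together cover the whole dataset.
    """
    if size < 1 or not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    return list(range(rank, num_samples, size))


def scale_learning_rate(base_lr: float, size: int) -> float:
    """Apply the linear scaling heuristic: multiply the rate by the worker count."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return base_lr * size


if __name__ == "__main__":
    size = 4  # stands in for hvd.size()
    for rank in range(size):  # each value stands in for hvd.rank() on one worker
        print(rank, shard_indices(10, rank, size))
    print("scaled learning rate:", scale_learning_rate(0.01, size))
```

Linear learning-rate scaling is a common starting point rather than a guarantee; larger worker counts often need a warm-up period or further tuning.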

Optimizing Distributed Training Processes

Running concurrent operations on multiple GPUs
Tuning hyperparameters
Enabling performance autotuning
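
As a sketch of how performance autotuning might be switched on, the snippet below prepares an environment for a training process with Horovod's documented `HOROVOD_AUTOTUNE` and `HOROVOD_AUTOTUNE_LOG` variables set. The helper name `autotune_env` and the log path are assumptions made for illustration; the same effect can be achieved with the `--autotune` option of `horovodrun`.

```python
import os
from typing import Dict, Mapping, Optional


def autotune_env(
    log_path: Optional[str] = None,
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with Horovod autotuning enabled.

    Hypothetical helper: it only prepares variables and starts no processes.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HOROVOD_AUTOTUNE"] = "1"
    if log_path is not None:
        # Optional file where Horovod records the parameters it tried.
        env["HOROVOD_AUTOTUNE_LOG"] = log_path
    return env


if __name__ == "__main__":
    # Placeholder log location; start from an empty base to keep the output short.
    env = autotune_env("/tmp/autotune_log.csv", base_env={})
    for key in sorted(env):
        print(f"{key}={env[key]}")
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when launching the training command.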

Troubleshooting

Summary and Conclusion

Requirements

An understanding of machine learning, specifically deep learning
Familiarity with machine learning libraries (TensorFlow, Keras, PyTorch, Apache MXNet)
Python programming experience

Audience

Developers
Data scientists

Duration

7 hours

Delivery Options

Private Group Training

Our identity is rooted in delivering exactly what our clients need.

Pre-course call with your trainer
Customisation of the learning experience to achieve your goals:

Bespoke outlines
Practical hands-on exercises containing data / scenarios recognisable to the learners

Training scheduled on a date of your choice
Delivered online, onsite/classroom or hybrid by experts sharing real world experience

Private group prices: RRP from £1900 for online delivery, based on a group of 2 delegates, plus £600 per additional delegate (excluding any certification or exam costs). We recommend a maximum group size of 12 for most learning events.
