Course Outline
Introduction
- What is OpenACC?
- OpenACC vs OpenCL vs CUDA vs SYCL
- Overview of OpenACC features and architecture
- Setting up the development environment
Getting Started
- Creating an OpenACC project in Visual Studio Code
- Exploring project structure and files
- Compiling and running the program
- Displaying output with printf and fprintf
OpenACC Directives and Clauses
- Understanding OpenACC directives and clauses
- Using parallel directives for creating parallel regions
- Using kernels directives for compiler-managed parallelism
- Using loop directives for parallelizing loops
- Managing data movement with data directives
- Synchronizing data with update directives
- Improving data reuse with cache directives
- Creating device functions with routine directives
- Synchronizing asynchronous operations with wait directives
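As an illustration of the directives and clauses listed above, here is a minimal sketch in C. The function name `saxpy` and its parameters are assumptions chosen for this example and are not taken from the course material. It combines a `parallel` directive with a `loop` directive and uses data clauses to describe what is moved to and from the device.

```c
/* Computes y = a * x + y element by element.
 * Hypothetical example function; not part of the course material. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    /* copyin: x is only read on the device, so it is copied host -> device.
     * copy:   y is read and written, so it is copied in and back out.
     * Without OpenACC support the pragma is ignored and the loop runs
     * sequentially on the host with the same result. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

A file containing this function could be built with, for example, `nvc -acc -Minfo=accel` (NVIDIA HPC SDK) or `gcc -fopenacc`, assuming the compiler was installed with OpenACC offloading support.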
OpenACC API
- Understanding the role of OpenACC API
- Querying device information and capabilities
- Setting device number and type
- Handling errors and exceptions
- Creating and synchronizing events
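The device query and selection topics above can be sketched with a few OpenACC runtime routines. The helper names `count_accelerators` and `select_device` are hypothetical and exist only for this example. The `_OPENACC` macro guard is one common way to keep the code compilable when OpenACC is not enabled; in that case the helpers simply report that no accelerator is available.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Returns the number of non-host devices visible to the OpenACC runtime,
 * or 0 when the file is compiled without OpenACC support.
 * Hypothetical helper for illustration. */
int count_accelerators(void)
{
#ifdef _OPENACC
    return acc_get_num_devices(acc_device_not_host);
#else
    return 0;
#endif
}

/* Tries to make device `device_num` the current device.
 * Returns the device number now in use, or -1 if the request is out of
 * range or OpenACC is not enabled. Hypothetical helper for illustration. */
int select_device(int device_num)
{
#ifdef _OPENACC
    int available = acc_get_num_devices(acc_device_not_host);
    if (device_num < 0 || device_num >= available) {
        return -1;
    }
    acc_set_device_num(device_num, acc_device_not_host);
    return acc_get_device_num(acc_device_not_host);
#else
    (void)device_num; /* unused without OpenACC */
    return -1;
#endif
}
```

Checking the device count before calling `acc_set_device_num` is a design choice made here to avoid a runtime failure on machines without an accelerator.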
OpenACC Libraries and Interoperability
- Understanding OpenACC libraries and interoperability
- Using math, random, and complex libraries
- Integrating with other models (CUDA, OpenMP, MPI)
- Integrating with GPU libraries (cuBLAS, cuFFT)
OpenACC Tools
- Understanding OpenACC tools in development
- Profiling and debugging OpenACC programs
- Performance analysis with the PGI compiler (now part of the NVIDIA HPC SDK), NVIDIA Nsight Systems, and Allinea Forge (now Linaro Forge)
Optimization
- Factors affecting OpenACC program performance
- Optimizing data locality and reducing transfers
- Optimizing loop parallelism and fusion
- Optimizing kernel parallelism and fusion
- Optimizing vectorization and auto-tuning
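To make the point about data locality and reduced transfers concrete, here is a small sketch in C. The function `scale_and_sum` is a hypothetical example, not course code. It wraps two parallel loops in one `data` region so the array is copied to the device once and back once, instead of once per loop.

```c
/* Multiplies every element of v by factor, then returns the sum of the
 * scaled elements. Hypothetical example function. */
double scale_and_sum(int n, double factor, double *restrict v)
{
    double sum = 0.0;

    /* One structured data region: v moves host -> device on entry and
     * device -> host on exit, and stays resident in between. */
    #pragma acc data copy(v[0:n])
    {
        /* present: v is already on the device, so no transfer happens here. */
        #pragma acc parallel loop present(v[0:n])
        for (int i = 0; i < n; ++i) {
            v[i] *= factor;
        }

        /* reduction: each worker accumulates privately and the partial
         * sums are combined into sum when the loop finishes. */
        #pragma acc parallel loop present(v[0:n]) reduction(+:sum)
        for (int i = 0; i < n; ++i) {
            sum += v[i];
        }
    }
    return sum;
}
```

If each loop carried its own `copy(v[0:n])` clause instead, the array would cross the host-device link four times rather than twice, which is the kind of overhead a profiler run typically reveals.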
Summary and Next Steps
Requirements
- An understanding of C/C++ or Fortran and of parallel programming concepts
- Basic knowledge of computer architecture and memory hierarchy
- Experience with command-line tools and code editors
Audience
- Developers who wish to learn how to use OpenACC to program heterogeneous devices and exploit their parallelism
- Developers who wish to write portable and scalable code that can run on different platforms and devices
- Programmers who wish to explore the high-level aspects of heterogeneous programming and improve their coding productivity
28 Hours