This course will be held in Edinburgh on 6-8 December 2011.
Description of Content
Basic parallelisation of loops using OpenMP is relatively straightforward. However, achieving good performance can be challenging. Effects such as false sharing and cache thrashing between threads can cripple performance in practice, even if a loop is, in principle, 100% parallelisable. In addition, the overheads of launching and synchronising threads can sometimes outweigh the benefits of parallelism. This course looks at techniques for managing the memory hierarchy more efficiently and for reducing OpenMP overheads using extended parallel regions, orphaned directives and more loosely synchronous execution. It will also cover advanced use of more recent OpenMP features, such as nested parallelism, and how to use OpenMP tasks in real codes. Optimal usage requires some understanding of how the OpenMP runtime library creates and manages the thread team, so the core features of OpenMP implementations will also be described.
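As a rough illustration of three of the techniques named above (an extended parallel region, orphaned directives and loosely synchronous execution), the following C sketch may help. It is not taken from the course material: the function names scale, shift and update, the array arguments and the constants are all hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper. The "omp for" here is an orphaned directive: it is
   not lexically inside a parallel region, and binds to whichever parallel
   region is active when the function is called. */
static void scale(double *a, size_t n, double s)
{
    /* nowait removes the implicit barrier at the end of the loop. */
    #pragma omp for schedule(static) nowait
    for (long i = 0; i < (long)n; i++)
        a[i] *= s;
}

/* Hypothetical helper, same pattern, working on an independent array. */
static void shift(double *b, size_t n, double d)
{
    #pragma omp for schedule(static) nowait
    for (long i = 0; i < (long)n; i++)
        b[i] += d;
}

/* Extended parallel region: the thread team is created once, outside the
   step loop, rather than once per loop nest per step.
   Assumption that makes nowait safe here: a and b do not overlap, and both
   loops use schedule(static) with the same iteration count in the same
   parallel region, so each thread revisits the same indices every step. */
void update(double *a, double *b, size_t n, int steps)
{
    assert(a != NULL && b != NULL);

    #pragma omp parallel
    {
        for (int t = 0; t < steps; t++) {
            scale(a, n, 0.5);
            shift(b, n, 1.0);
        }
    } /* implicit barrier at the end of the parallel region */
}
```

With GCC this would typically be compiled with the -fopenmp flag. Without it the pragmas are ignored and the code runs serially with the same results. Whether removing barriers like this pays off in a real code depends on the loop sizes and the data dependencies, so it should be measured rather than assumed.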
Students should be competent programmers in C, C++ or Fortran but do not need to have any prior HPC or parallel programming experience.
Please note that we expect most attendees to bring their own laptops. However, we can supply access to terminals if required; please let us know in advance.