
PX425 High Performance Computing in Physics

Lecturer: Nick Hine
Weighting: 7.5 CATS

The module will address the increased use of computer simulation and data analysis on high performance computers in all fields of computational physics and other sciences. Computing skills are greatly valued across science and beyond and we encourage students to go as far as they can to develop such skills.

The module aims to explain the methods used in computer simulation and data analysis on high performance computers, for research in all fields of computational physics and other sciences.

At the end of this module you should be able to:

  • Identify and correct common inefficiencies in serial scientific computer codes
  • Choose an appropriate programming paradigm for a particular problem or hardware architecture
  • Write a parallel program using shared-memory or message-passing constructs in a physics context
  • Write a simple GPU-accelerated program
  • Identify sources of performance bottlenecks in parallel computer programs and understand how these relate to the computer architecture
  • Use batch systems to access parallel computing hardware and to validate the correctness of a parallel computer program vs equivalent serial software


Programming for efficiency. Modern cache architectures and CPU pipelining. Avoiding expensive and repeated operations. Compiler optimisation flags. Profiling with gprof.

Introduction to parallel computing. Modern HPC hardware and parallelisation strategies. Applications in physics: super problems need supercomputers.

Shared memory programming. The OpenMP standard. Parallelisation using compiler directives. Threading and variable types. Loop and sections constructs. Program correctness and reproducibility. Scheduling and false sharing as factors influencing performance.

Distributed memory programming. The MPI standard for message passing. Point-to-point and collective communication. Synchronous vs asynchronous communication. MPI communicators and topologies.

GPU programming. CUDA vs OpenCL. Kernels and host-device communication. Shared and constant memory, synchronicity and performance. GPU coding restrictions.

Limitations to parallel performance. Strong vs weak scaling. Amdahl’s law. Network contention in modern many-core architectures. Mixed mode OpenMP+MPI programming.

Commitment: 15 Lectures + 5 Laboratory Sessions

Assessment: Assignments (100%)

Recommended Texts: R Chandra et al., Parallel Programming in OpenMP, Morgan Kaufmann;
P Pacheco, Parallel Programming with MPI, Morgan Kaufmann;
M Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill;
D Kirk and W Hwu, Programming Massively Parallel Processors, Elsevier

Module Homepage

This module has its own website where lecture notes and other resources are available.

Leads from: PX390 Scientific Computing. A good working knowledge of a scientific programming language, preferably C, is essential.