Book of Abstracts

The Book of Abstracts is available to download as a PDF here.

Talks

Day 2: Monday the 13th of March

9:20 - 10:10 : Tutorial

TJ McKinley (University of Exeter) A tutorial on history matching with emulation for epidemic models

10:10 - 10:30 : Coffee Break

10:30 - 12:00 : Keynotes

V Minin (University of California Irvine) Fitting stochastic epidemic models to noisy surveillance data: are we there yet?
Stochastic epidemic models describe how infectious diseases spread through a population of interest. These models are constructed by first assigning individuals to compartments (e.g., susceptible, infectious, and recovered) and then defining a stochastic process that governs the evolution of the sizes of these compartments through time. I will review multiple lines of attack on a challenging and not fully solved problem: fitting these models to noisy infectious disease surveillance data. These solutions involve a range of mathematical techniques: particle filter Markov chain Monte Carlo algorithms, approximations of stochastic differential equations, and Poisson random measure-based Bayesian data augmentation. Importantly, many of these computational strategies open the door for integration of multiple infectious disease surveillance data streams, including less conventional ones (e.g., pathogen wastewater monitoring and genomic surveillance). Such data integration is critical for making key parameters of stochastic epidemic models identifiable. I will illustrate state-of-the-art statistical inference for stochastic epidemic models using Influenza, Ebola, and SARS-CoV-2 surveillance data and will conclude with open problems and challenges that remain to be addressed.
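As an illustration of the construction described in this abstract (compartments plus a stochastic process governing their sizes), here is a minimal sketch of a continuous-time stochastic SIR model simulated with the Gillespie algorithm. It is not taken from the talk: the function name `simulate_sir`, the frequency-dependent infection rate and all parameter values are assumptions chosen for illustration.

```python
import random


def simulate_sir(beta, gamma, s0, i0, r0, t_max, seed=0):
    """Simulate one path of a stochastic SIR model (Gillespie algorithm).

    beta and gamma are hypothetical transmission and recovery rates.
    Returns a list of (time, S, I, R) tuples, one per event.
    """
    rng = random.Random(seed)
    n = s0 + i0 + r0
    t, s, i, r = 0.0, s0, i0, r0
    path = [(t, s, i, r)]
    while i > 0:
        infection_rate = beta * s * i / n  # assumed frequency-dependent mixing
        recovery_rate = gamma * i
        total_rate = infection_rate + recovery_rate
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_max:
            break
        # Choose the event type with probability proportional to its rate.
        if rng.random() * total_rate < infection_rate:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        path.append((t, s, i, r))
    return path


path = simulate_sir(beta=0.5, gamma=0.2, s0=99, i0=1, r0=0, t_max=100.0)
_, s, i, r = path[-1]
print(f"events: {len(path) - 1}, final S={s}, I={i}, R={r}")
```

Surveillance data would typically be a noisy, partial function of such a path (for example, under-reported weekly counts of new infections), which is what makes the fitting problem hard.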
C Fuchs (Universität Bielefeld) Integrative modelling of infections in a corona virus cohort study
Since the outbreak of the corona pandemic, many mathematical and statistical approaches have been used to identify drivers of infection and disease, to advise decision makers, and to predict the further course. However, models that worked well in idealized contexts often encountered limitations in real-world situations; these include unreliable information on numbers of infected individuals, and a rapidly changing environment regarding control measures, varying testing and vaccination capacities and strategies, weather conditions, and new virus variants.
I will report from our work in the KoCo19 Consortium, which has established a representative cohort for the Munich population in private households and has collected and analyzed laboratory and questionnaire data in several rounds since April 2020. We estimate the prevalence of SARS-CoV-2 in the population, evaluate the reliability of diagnostics, investigate the role of infection within households, and combine different data sources to reliably estimate the effectiveness of non-pharmaceutical intervention methods. This is done using a variety of statistical methods, not least Bayesian modelling.

12:00 - 13:30 : Lunch Break

13:30 - 15:00 : Invited session on Environmental Stochasticity

T Kypraios (University of Nottingham) Bayesian nonparametric inference for stochastic infectious disease models
Infectious disease transmission models require assumptions about how the pathogen spreads between individuals. These assumptions may be somewhat arbitrary, particularly when it comes to describing how transmission varies between individuals of different types or in different locations, and may in turn lead to incorrect conclusions or policy decisions. In this talk, we will present a novel and general Bayesian nonparametric framework for transmission modelling which removes the need to make such specific assumptions with regard to the infection process. We use multi-output Gaussian process prior distributions to model different infection rates in populations containing multiple types of individuals. Further challenges arise because the transmission process itself is unobserved, and large outbreaks can be computationally demanding to analyse. We address these issues by data augmentation and a suitable efficient approximation method. Simulation studies using synthetic data demonstrate that our framework gives accurate results. Finally, we use our methods to enhance our understanding of the transmission mechanisms of the 2001 UK Foot and Mouth Disease outbreak.
PJ Birrell (UKHSA) An approximate diffusion process for environmental stochasticity in infectious disease transmission modelling
Throughout the course of the SARS-CoV-2 pandemic we fulfilled a requirement, both from within the UK Health Security Agency (including Public Health England) and externally through SPI-M, to produce pandemic nowcasts and medium-term ‘projections’ in real time. This was achieved through the PHE-Cambridge real-time model (RTM), a deterministic compartmental model designed to assimilate an array of pandemic data streams to produce epidemic nowcasts and forecasts, whilst estimating key epidemic quantities. Such models need to accurately capture fluctuations in transmission arising from external determinants (e.g. changing government advice, the emergence of new variants). Ignoring this “environmental stochasticity” can lead to mis-calibrated models that underestimate uncertainty and produce biased predictions. This stochasticity was injected in the current RTM by modelling the infection hazard, β_t, as a stochastic process: a discrete-time random walk with coarse time steps (Birrell et al., 2021).
This, however, was a pragmatic choice, designed, above all, to alleviate computational burden. Here we will present an alternative representation for β_t, exploiting approximating (deterministic) pathwise expansions of diffusion processes to replace the stochastic random walk. The method is shown to capture the dynamics equally well, with the potential for curbing the dimensionality of the posterior as data accumulate.
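To make the random-walk representation of β_t concrete, the following is a minimal sketch of a discrete-time random walk with coarse time steps. It is an assumption-laden illustration, not the RTM implementation: the walk is placed on log β_t (to keep the hazard positive), and the weekly block length, the step standard deviation `sigma` and the function name `random_walk_beta` are all hypothetical.

```python
import math
import random


def random_walk_beta(beta0, sigma, n_days, block_length=7, seed=0):
    """Daily infection hazard that is piecewise constant over blocks.

    Assumption: log(beta) takes one Gaussian step (s.d. sigma) at the start
    of each new block of block_length days, i.e. the coarse time step.
    """
    rng = random.Random(seed)
    log_beta = math.log(beta0)
    betas = []
    for day in range(n_days):
        if day > 0 and day % block_length == 0:
            log_beta += rng.gauss(0.0, sigma)  # one increment per block
        betas.append(math.exp(log_beta))
    return betas


betas = random_walk_beta(beta0=0.3, sigma=0.1, n_days=28)
print([round(b, 3) for b in betas[::7]])  # one value per weekly block
```

In this sketch every block adds one latent increment, so the number of unknowns grows as data accumulate; that growth is the dimensionality issue the abstract refers to.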
L Guzmann-Rincon (University of Warwick) Bayesian estimation of the instant growth rate of SARS-CoV-2 positive cases in England, using Gaussian processes
The growth rate estimation of SARS-CoV-2 positive cases is crucial for understanding the evolution of the pandemic. We propose a method for estimating the growth rate of the proportion of positive cases in England and its local authorities. The proposed Bayesian model incorporates a Gaussian process as a latent effect, employed to compute the growth rate and higher derivatives. This method does not make assumptions about generation times and can be adapted to different spatial geographies and population subgroups. (Preprint: https://www.medrxiv.org/content/10.1101/2022.01.01.21268131v1.full).

15:30 - 17:00 : Contributed session on Informing Policy

R Deardon (University of Calgary) Identifying behavioural change mechanisms in epidemic models
The COVID-19 pandemic has illustrated both the utility and the limitations of using epidemic models for understanding and forecasting disease spread. One of the many difficulties in modelling epidemic spread is that caused by behavioural change in the underlying population. This can be a major issue in public health since, as we have seen during the COVID-19 pandemic, behaviour in the population can change drastically as infection levels vary, both due to government mandates and personal decisions. Such changes in the underlying population result in major changes in the transmission dynamics of the disease, making modelling challenging. These issues arise in agriculture as well as public health, as changes in farming practice are also often observed as disease prevalence changes. We propose a model formulation where time-varying transmission is captured by the level of alarm in the population and specified as a function of the past epidemic trajectory. The model is set in a data-augmented Bayesian framework as epidemic data are often only partially observed, and we can utilize prior information to help with parameter identifiability. We investigate the estimability of the population alarm across a wide range of scenarios, using both parametric functions and non-parametric Gaussian processes and splines. The benefit and utility of the proposed approach are illustrated through an application to COVID-19 data from New York City.
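The idea of an alarm level that is a function of the past epidemic trajectory and that modulates transmission can be sketched as follows. This is only one possible parametric form, chosen for illustration: the saturating function, the 14-day window, the `half_saturation` parameter and both function names are assumptions, not the formulation used in the talk.

```python
def alarm_level(past_incidence, window=14, half_saturation=50.0):
    """Hypothetical alarm in [0, 1): a saturating function of recent incidence."""
    recent = past_incidence[-window:]
    if not recent:
        return 0.0  # no history yet, so no alarm
    mean_recent = sum(recent) / len(recent)
    return mean_recent / (mean_recent + half_saturation)


def transmission_rate(beta0, past_incidence, **alarm_kwargs):
    """Baseline rate beta0 scaled down by the current alarm level."""
    return beta0 * (1.0 - alarm_level(past_incidence, **alarm_kwargs))


incidence = [0, 2, 5, 20, 80, 150]
for day in range(len(incidence) + 1):
    # Transmission on each day depends only on incidence observed before it.
    print(day, round(transmission_rate(0.4, incidence[:day]), 3))
```

A non-parametric version would replace the fixed saturating curve with, for example, a spline or Gaussian process over the recent-incidence summary.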
E Semoneva (University of Oxford) Spatial statistics with deep generative modelling: flexible and efficient disease mapping with MCMC and deep learning
Hierarchical Bayesian models are the current state-of-the-art approach to disease mapping. When working with areal data (e.g. district-level aggregates), routinely used models rely on the adjacency structure of areal units to account for spatial correlations. This approach ignores the continuous nature of spatial processes and is very rigid with respect to the change of support problem, i.e. when administrative boundaries change or when mapping needs to be done at a different administrative level. We present a novel, practical and easy-to-implement solution relying on a methodology combining deep generative modelling and fully Bayesian inference. We apply a recently proposed method of encoding spatial priors with variational autoencoders (VAEs) to the change of support problem and malaria mapping. As malaria control programs continue to create novel control strategies, district-level disease mapping remains a fundamental surveillance tool for analysing the present and historical distribution of the disease. We encode realisations of the Gaussian process prior over a fine artificial spatial grid, aggregated to the level of administrative boundaries, and use these VAE-priors at the inference stage. We demonstrate that the new method is faster and more efficient than state-of-the-art Bayesian models estimated via Markov chain Monte Carlo algorithms.
A Beloconi (Swiss TPH) Malaria, climate variability and the effect of interventions: modelling transmission dynamics
Assessment of the relative impact of climate change on malaria dynamics is a complex problem. Climate is a well-known factor that plays a crucial role in driving malaria outbreaks in epidemic transmission areas. However, its influence in endemic environments with intensive malaria control interventions is not fully understood, mainly due to the scarcity of high-quality, long-term malaria data. The demographic surveillance systems in Africa offer unique platforms for quantifying the relative effects of weather variability on the burden of malaria. Here, using a process-based stochastic transmission model, we show that in the lowlands of malaria endemic western Kenya, variations in climatic factors played a key role in driving malaria incidence during 2008-2019, despite high bed net coverage and use among the population. Bayesian computation and inference based on pMCMC were compared to iterated filtering. The model accounts for the main mechanisms related to malaria dynamics, including immunity, infectivity, and human migration, and opens the possibility to forecast malaria in endemic regions, taking into account the interaction between future climatic conditions and intervention scenarios.

17:00 - 18:00 : Break

18:00 - 21:30 : Poster session

Day 3: Tuesday the 14th of March

9:10 - 10:10 : Invited session on Sampling from the Hidden states

C Pooley (Biomathematics and Statistics Scotland) Fast inference and model selection on epidemiological models using model-based proposals
Stochastic process-based models are used in a variety of fields, from epidemiology to biochemistry to quantitative genetics to finance. Performing Bayesian inference on these models has proved challenging, especially when the number of model parameters is large or the posterior is highly correlated.

A standard approach has been to use Data Augmentation Markov chain Monte Carlo (DA-MCMC). This, however, can suffer from three major shortcomings: 1) a poor initial choice for the chain can lead to the system becoming stuck in local minima, 2) a high degree of correlation between successive samples can lead to long computational times, and 3) reliable model selection is challenging. Widely used, and highly parallelisable, alternatives are Approximate Bayesian Computation (ABC) and sequential Monte Carlo variants (SMC-ABC), and in principle exact particle filtering methods such as particle MCMC (PMCMC). However, each of these approaches has limitations which in practice can result in poor estimation of the posterior.

To move beyond these limitations this talk introduces two new inference algorithms: Approximate Bayesian Computation using MBPs (ABC-MBP) and Particle Annealed Sampling using MBPs (PAS-MBP). Both of these methods combine, in a novel way, two pre-existing ideas: firstly, using many “particles” (combinations of model parameters and system state) which, over successive generations, pass from the prior to the posterior, and secondly so-called “model-based proposals” (MBPs) originally designed for speeding up DA-MCMC mixing.

ABC-MBP and PAS-MBP differ in how they treat the observation model: ABC-MBP assumes a simple cut-off in an error function, whereas PAS-MBP allows for a flexible mechanistic observation model to be specified. In this talk PAS-MBP and ABC-MBP are compared against other methodologies in the literature (standard ABC, ABC-SMC, PMCMC and MC³) using a number of benchmark epidemiological models (ranging in complexity and level of stochasticity). The new methods are found to come out fastest, as well as being easily parallelisable and posing fewer problems in terms of optimisation.

ABC-MBP is applied to an age-stratified compartmental model using publicly available COVID-19 data for England and Wales. This represents a challenging problem (fitting ~150 model variables using ~9000 observations) which yielded improved estimation of the age-stratified contact matrix as well as time variation in the reproduction number.
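The "simple cut-off in an error function" mentioned in this abstract is the acceptance rule at the heart of standard ABC. The sketch below shows plain rejection ABC with that rule; it is not ABC-MBP (there are no particles evolving over generations and no model-based proposals). The chain-binomial simulator, the uniform prior on `beta`, the Euclidean error function and all numerical values are assumptions made for illustration.

```python
import math
import random


def simulate_weekly_cases(beta, rng, n=100, i0=2, gamma=0.5, weeks=8):
    """Hypothetical chain-binomial SIR simulator: new infections per step."""
    s, i = n - i0, i0
    cases = []
    for _ in range(weeks):
        p_inf = 1.0 - math.exp(-beta * i / n)  # per-susceptible infection prob.
        new_inf = sum(rng.random() < p_inf for _ in range(s))
        new_rec = sum(rng.random() < gamma for _ in range(i))
        s, i = s - new_inf, i + new_inf - new_rec
        cases.append(new_inf)
    return cases


def abc_rejection(observed, n_draws, cutoff, seed=0):
    """Keep prior draws whose simulated data fall within cutoff of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        beta = rng.uniform(0.0, 3.0)  # assumed uniform prior on beta
        simulated = simulate_weekly_cases(beta, rng, weeks=len(observed))
        error = math.sqrt(sum((a - b) ** 2 for a, b in zip(simulated, observed)))
        if error <= cutoff:  # the simple cut-off in the error function
            accepted.append(beta)
    return accepted


observed = simulate_weekly_cases(1.5, random.Random(1))  # synthetic "data"
accepted = abc_rejection(observed, n_draws=500, cutoff=15.0)
print(f"accepted {len(accepted)} of 500 prior draws")
```

Tightening `cutoff` brings the accepted sample closer to the true posterior at the price of a lower acceptance rate, which is the inefficiency that more elaborate schemes aim to reduce.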
J Xu (Duke University) Efficient Branching Process Proposals and Data-Augmented MCMC for the Stochastic SIR Model
We propose a novel data-augmented Markov Chain Monte Carlo algorithm for exact Bayesian inference under the stochastic susceptible-infected-removed model, given only discretely observed counts of infection. Incidence data in our setting present challenges to inference because they only partially inform us of the underlying continuous-time process. To account for the missing data while targeting the exact posterior of model parameters, we make use of latent variables that are jointly proposed from a surrogate process carefully designed to closely resemble the SIR model. This allows us to efficiently generate epidemics consistent with the observed data, and extends to non-Markovian settings. Our Markov chain Monte Carlo algorithm is shown to be uniformly ergodic, and we find that it mixes significantly faster than existing single-site samplers on several real and simulated data applications.

10:10 - 10:30 : Coffee Break

10:30 - 12:00 : Keynotes

S Cauchemez (Institut Pasteur) Bayesian data augmentation methods applied to infectious disease epidemiology
In this talk, I will discuss how Bayesian data augmentation approaches have been used to analyze complex infectious disease datasets and gain key insights on transmission dynamics, correlates of protection, identification of unobserved infections, evolution of key biomarkers, etc. The talk will be illustrated with analyses that considered different types of epidemiological datasets including outbreak investigations, cross-sectional serological studies and longitudinal cohorts, as well as different pathogens (influenza, dengue, COVID-19).
P Nouvellet (University of Sussex) Understanding and quantifying pathogen transmission, power of Bayesian approaches
Bayesian approaches have become increasingly popular in epidemiology for estimating key parameters of infectious disease transmission, such as the reproduction number. The basic reproduction number, commonly denoted R0, measures the average number of secondary infections caused by a single infected individual. R0 can be generalised to an effective reproduction number, denoted Rt, estimated at any specific point in time, given the current conditions (e.g., population immunity, control measures in place).
In this presentation, I will show how, using a Bayesian approach, we have extended a popular framework to estimate Rt to 1) disentangle the impact of importation from local transmission, 2) understand and quantify the drivers of transmission, 3) estimate the potential transmission advantage of emerging variants. Overall, Bayesian approaches offer a powerful framework for measuring the reproduction number of pathogens and understanding the complex dynamics of infectious disease transmission. These methods can help public health officials make informed decisions and develop effective strategies for controlling and preventing the spread of disease.
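For readers unfamiliar with Bayesian estimation of Rt, the sketch below shows one widely used formulation based on the renewal equation: incidence on day t is modelled as Poisson with mean Rt times the infection pressure from earlier cases, and a Gamma prior on Rt then gives a Gamma posterior in closed form. The abstract does not name the framework it extends, so treating it as this renewal-equation model is an assumption, as are the function names, the 7-day window, the prior and the serial-interval weights.

```python
def infection_pressure(incidence, serial_interval, t):
    """Lambda_t = sum over s >= 1 of incidence[t - s] * w_s."""
    return sum(
        incidence[t - s] * w
        for s, w in enumerate(serial_interval, start=1)
        if t - s >= 0
    )


def rt_posterior(incidence, serial_interval, t, window=7,
                 prior_shape=1.0, prior_scale=5.0):
    """Gamma posterior (shape, rate) for Rt, assumed constant over the window.

    Model: incidence[d] ~ Poisson(Rt * Lambda_d) for each day d in the window,
    with a Gamma(prior_shape, scale=prior_scale) prior on Rt (assumed values).
    """
    days = range(max(1, t - window + 1), t + 1)
    shape = prior_shape + sum(incidence[d] for d in days)
    rate = 1.0 / prior_scale + sum(
        infection_pressure(incidence, serial_interval, d) for d in days
    )
    return shape, rate


# Hypothetical daily case counts and serial-interval weights (sum to 1).
cases = [1, 2, 3, 5, 8, 12, 18, 25, 33, 40, 46, 50, 52, 51, 48]
weights = [0.2, 0.5, 0.3]
shape, rate = rt_posterior(cases, weights, t=14)
print(f"posterior mean of Rt on day 14: {shape / rate:.2f}")
```

Extensions of the kind listed in the abstract (separating imported from local cases, adding covariates for drivers of transmission, comparing variants) would change how the infection pressure and Rt are specified rather than this basic conjugate update.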

12:00 - 15:00 : Lunch and Ski break

15:00 - 16:30 : Invited session on Inference of nonlinear dynamics

L Rimella (University of Lancaster) Approximating optimal SMC proposal distributions in individual-based epidemic models
Many epidemic models are naturally defined as individual-based models, in which we track the state of each individual within a susceptible population. Inference for individual-based models is challenging due to the high-dimensional state-space of such models, which increases exponentially with population size. We consider sequential Monte Carlo algorithms for inference for individual-based epidemic models where we make direct observations of the state of a sample of individuals. Standard implementations, such as the bootstrap filter or the auxiliary particle filter, are inefficient due to a mismatch between the proposal distribution of the state and future observations. We develop new efficient proposal distributions that take account of future observations, leveraging the properties that (i) we can analytically calculate the optimal proposal distribution for a single individual given future observations and the future infection rate of that individual; and (ii) the dynamics of individuals are independent if we condition on their infection rates. Thus we construct estimates of the future infection rate for each individual, and then use an independent proposal for the state of each individual given this estimate. Empirical results show an order-of-magnitude improvement in efficiency of the sequential Monte Carlo sampler for both SIS and SEIR models.
M Whitehouse (University of Bristol) Consistent and fast inference in compartmental models of epidemics using Poisson Approximate Likelihoods
Addressing the challenge of scaling-up epidemiological inference to complex and heterogeneous models, we introduce Poisson Approximate Likelihood (PAL) methods. In contrast to the popular ODE approach to compartmental modelling, in which a large population limit is used to motivate a deterministic model, PALs are derived from approximate filtering equations for finite-population, stochastic compartmental models, and the large population limit drives the consistency of maximum PAL estimators. Our theoretical results appear to be the first likelihood-based parameter estimation consistency results applicable across a broad class of partially observed stochastic compartmental models concerning the large population limit. Compared to simulation-based methods such as Approximate Bayesian Computation and Sequential Monte Carlo, PALs are simple to implement, involving only elementary arithmetic operations and no tuning parameters; and fast to evaluate, requiring no simulation from the model and having computational cost independent of population size. Through examples, we demonstrate how PALs can be: embedded within Delayed Acceptance Particle Markov Chain Monte Carlo to facilitate Bayesian inference; used to fit an age-structured model of influenza, taking advantage of automatic differentiation in Stan; and applied to calibrate a spatial meta-population model of measles.
J Wheeler (University of Michigan) Informing policy via dynamic models: Cholera in Haiti
Policy decisions related to an infectious disease outbreak are often informed by incidence data and scientifically motivated dynamic models. The development of useful models requires addressing the tradeoff between biological fidelity and model simplicity, and the reality of misspecification for models at all levels of complexity. As a case study, we consider the 2010-2019 cholera epidemic in Haiti. We study three dynamic models developed by expert teams to advise on vaccination policies. We assess previous methods used for fitting and evaluating these models, and we develop data analysis strategies leading to improved statistical fit. Specifically, we present approaches to diagnosis of model misspecification, development of alternative models, and computational improvements in optimization, in the context of likelihood-based inference on nonlinear dynamic systems. Our workflow is reproducible and extendable, facilitating future investigations of this disease system.

16:30 - 17:00 : Coffee Break

17:00 - 18:30 : Invited session on Phylogenetic inference

P Marttinen (Aalto University) A Bayesian model of acquisition and clearance of bacterial colonization incorporating within-host variation
Bacterial populations that colonize a host can play important roles in host health, including serving as a reservoir that transmits to other hosts and from which invasive strains emerge, thus emphasizing the importance of understanding rates of acquisition and clearance of colonizing populations. Studies of colonization dynamics have been based on assessment of whether serial samples represent a single population or distinct colonization events. With the use of whole genome sequencing to determine genetic distance between isolates, a common solution to estimate acquisition and clearance rates has been to assume a fixed genetic distance threshold below which isolates are considered to represent the same strain. However, this approach is often inadequate to account for the diversity of the underlying within-host evolving population, the time intervals between consecutive measurements, and the uncertainty in the estimated acquisition and clearance rates. Here, we present a fully Bayesian model that provides probabilities of whether two strains should be considered the same, allowing us to determine bacterial clearance and acquisition from genomes sampled over time. Our method explicitly models within-host variation using population genetic simulation, and the inference is done using a combination of Approximate Bayesian Computation (ABC) and Markov Chain Monte Carlo (MCMC). We validate the method with multiple carefully conducted simulations and demonstrate its use in practice by analyzing a collection of methicillin-resistant Staphylococcus aureus (MRSA) isolates from a large longitudinal clinical study.
J Koskela (University of Warwick) Bayesian inference of recombinant ancestries
DNA sequences are correlated due to common ancestry among individuals. In most cases ancestral relations and DNA sequences cannot be observed, necessitating a mathematical model for latent ancestries. The so-called coalescent with recombination, or CwR, is a gold-standard model in the genome-scale setting, where ancestral trees of different genes differ due to a biological process called recombination. However, imputing missing ancestries under the CwR is notoriously computationally expensive. I will introduce the CwR model, and show how recent progress in data structures for storing CwR realisations clarifies the exact reason for the computational bottleneck. I'll also demonstrate that despite their cost, MCMC algorithms for the CwR provide useful benchmarks for assessing biases and uncertainty in more scalable methods.
A Gill (University of Warwick) Bayesian Inference of Reproduction Number from Genomic and Epidemic Data using MCMC Methods
The reproduction number R(t) represents the average number of new infections caused by a single infected individual at time t. Estimation of the reproduction number R(t) is of vital importance during an epidemic outbreak, for example, to decide whether to implement control measures and to determine their effects once implemented. Typically, the reproduction number R(t) is inferred using only epidemic data, such as prevalence per day. However, prevalence data is often noisy, partially observed and biased. Genomic data is therefore increasingly being used to understand infectious disease epidemiology.
We take a Bayesian approach to this problem to find the trajectory of R(t) given a dated phylogeny and partial prevalence data using particle Markov chain Monte Carlo methods. We have implemented a particle marginal Metropolis–Hastings algorithm with backward simulation to jointly infer the hyper-parameters of the model, the latent epidemic and the trajectory of R(t). The performance of this approach is analysed using simulated data. These simulations show that incorporating genomic data as well as epidemic data improves inference in a variety of cases.