# Algorithms & Computationally Intensive Inference seminars

**Terms 1-3, Location MB2.23, Fridays 12:15-14:00 (12:15-12:45 is an informal sandwich lunch).**

*Reminder emails are not sent to participants unless there is a change to the scheduled programme at short notice. If you would like to speak, or you want to be included in any emails, please contact one of the organisers.*

**Current Organisers**: Dootika Vats

*If you would like to talk, or have ideas for possible speakers, then please email one of the organisers above.*

**Website URL:** www.warwick.ac.uk/compstat

**Mailing List Sign-Up:** http://mailman1.csv.warwick.ac.uk/mailman/listinfo/algorithmseminar

**Mailing List:** algorithmseminar@listserv.csv.warwick.ac.uk **(NB - only approved members can post)**

**2018/19 Term 1:**

- Week 1 - 5th October - Cancelled
- Week 2 - 12th October - Summary of research short talks
- Week 3 - 19th October - Summary of research short talks

- Week 4 - 26th October - Giulio Morina (Warwick) - "From the Bernoulli Factory to a Dice Enterprise via Perfect Sampling of Markov Chains"
- Abstract: Given a p-coin that lands heads with unknown probability p, we wish to construct an algorithm that produces an f(p)-coin for a given function f:(0, 1)→(0, 1). This problem is commonly known as the Bernoulli Factory, and generic ways to design a practical algorithm for a given function f exist only in a few special cases. We present a constructive way to build an efficient Bernoulli Factory when f(p) is a rational function. Moreover, we extend the original problem to a more general setting where we have access to an m-sided die and we wish to roll a v-sided one, where the probability of rolling each face is a fixed function of the original probabilities. We achieve this by perfectly simulating from the stationary distribution of a certain class of Markov chains.
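
The rational-function construction of the talk is more general, but two classical examples give the flavour of the Bernoulli Factory problem. The sketch below (plain Python, illustrative only, not the speaker's method) builds an f(p)-coin for f(p) = p² and, via von Neumann's trick, for f(p) = 1/2:

```python
import random

def coin(p, rng):
    """One flip of a p-coin: True (heads) with probability p."""
    return rng.random() < p

def squared_coin(p, rng):
    """An f(p)-coin for f(p) = p^2: heads iff two independent p-flips are both heads."""
    return coin(p, rng) and coin(p, rng)

def fair_coin(p, rng):
    """Von Neumann's trick: an exactly fair coin from a p-coin, for any p in (0, 1)."""
    while True:
        a, b = coin(p, rng), coin(p, rng)
        if a != b:
            return a  # HT -> heads, TH -> tails, each with probability p(1-p)

rng = random.Random(1)
n = 200_000
est_sq = sum(squared_coin(0.3, rng) for _ in range(n)) / n
est_fair = sum(fair_coin(0.3, rng) for _ in range(n)) / n
print(est_sq, est_fair)  # close to 0.09 and 0.5 respectively
```

Both constructions use only flips of the unknown p-coin, never the value of p itself, which is the defining constraint of the problem.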
- Week 5 - 2nd November - Cancelled. CoSInES Launch Day.

- Week 6 - 9th November - Iker Perez (Nottingham) - "Novel approaches to efficiently augment a Markov Jump process for exact Bayesian Inference"
- Abstract: In this talk we will discuss foundational statistical challenges associated with families of Markovian jump models, which often find applications in domains such as genetics, epidemiology, mathematical biology or queueing theory. We will first review Markov jump processes, and by means of common accessible examples, illustrate the computational impediments posed by real-world application scenarios to inverse uncertainty quantification tasks. Then, we will give an overview of the recent advances linked to structured jump systems. Our work is concerned with building on uniformization procedures and we propose a novel efficient auxiliary-variable algorithm for data augmentation, which yields computationally tractable distributions suited for exact (in Monte Carlo sense) Bayesian inference in often large, infinite or multivariate population systems. We demonstrate the capabilities of the presented methods by drawing Bayesian inference for partially observed stochastic epidemics and show that it overcomes the limitations of existing vanilla approaches. This is joint work (in progress) with Theo Kypraios.

- Week 7 - 16th November - Ritabrata Dutta (Warwick) - "Well-Tempered Hamiltonian Monte Carlo on Active-Space"
- Abstract: When the gradient of the log-target distribution is available, Hamiltonian Monte Carlo (HMC) has been proved to be an efficient simulation algorithm. However, HMC performs poorly when the target is high-dimensional and has multiple isolated modes. To alleviate these problems we propose to perform HMC on a locally and continuously tempered target distribution. This tempering is based on an efficient approach to simulate molecular dynamics in high-dimensional space, known as well-tempered meta-dynamics. The tempering we suggest is performed locally and only along the directions of the maximum changes in the target, which we identify as the active space of the target. The active space is the span of the eigenfunctions corresponding to the dominant eigenvalues of the expected Hessian matrix of the log-target. To capture the state-dependent non-linearity of the target, we iteratively estimate the active space from the most recent batch of samples obtained from the simulation. Finally, we suggest a re-weighting scheme based on path-sampling to provide importance weights for the samples drawn from the continuously-tempered distribution. We illustrate the performance of this scheme for target distributions with complex geometry and multiple modes on high-dimensional spaces, in comparison with traditional HMC with the No-U-Turn Sampler.
- Week 8 - 23rd November - Stephen Connor (York) - "Omnithermal Perfect Simulation for Multi-server Queues"
- Abstract: The last few years have seen much progress in the area of perfect simulation algorithms for multi-server queueing systems, allowing us to sample from the exact equilibrium distribution of the Kiefer-Wolfowitz workload vector. This talk will describe an "omnithermal" variant of one such algorithm for M/G/c queues, which permits simultaneous sampling from the equilibrium distributions for a range of c (the number of servers) at relatively little additional cost.

- Week 9 - 30th November - Emilia Pompe (Oxford) - "Adaptive MCMC for Multimodal Distributions"
- Abstract: We propose a new Monte Carlo method for sampling from multimodal distributions (Jumping Adaptive Multimodal Sampler). The idea of this technique is based on splitting the task into two: finding the modes of the target distribution and sampling, given the knowledge of the locations of the modes. The sampling algorithm is based on steps of two types: local ones, preserving the mode, and jumps to a region associated with a different mode. Besides, the method learns the optimal parameters while it runs, without requiring user intervention. The main properties of our algorithm will be discussed and its performance will be illustrated with several examples of multimodal target distributions. Some ergodic results that we proved for this method will also be presented. This is joint work with Chris Holmes and Krys Latuszynski.
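
The JAMS sampler itself learns mode locations and proposal parameters adaptively while it runs. As a much-simplified illustration of the two move types it combines (mode-preserving local steps and between-mode jumps), here is a hedged sketch for a toy bimodal target in which the mode locations are assumed known; none of the adaptive machinery of the talk is reproduced:

```python
import math, random

def log_target(x):
    # equal-weight mixture of N(-5, 1) and N(+5, 1), up to an additive constant
    return math.log(math.exp(-0.5 * (x + 5.0)**2) + math.exp(-0.5 * (x - 5.0)**2))

def sample(n_iter=200_000, seed=0):
    rng = random.Random(seed)
    modes = [-5.0, 5.0]
    x, out = -5.0, []
    for _ in range(n_iter):
        if rng.random() < 0.1:
            # jump move: carry the offset from the nearest mode over to the other mode
            cur = min(modes, key=lambda m: abs(x - m))
            other = modes[0] if cur == modes[1] else modes[1]
            y = x - cur + other
        else:
            # local move: mode-preserving random walk
            y = x + rng.gauss(0.0, 1.0)
        # Metropolis accept/reject (both proposals are symmetric)
        if log_target(y) >= log_target(x) or rng.random() < math.exp(log_target(y) - log_target(x)):
            x = y
        out.append(x)
    return out

xs = sample()
frac_right = sum(x > 0 for x in xs) / len(xs)
print(frac_right)  # close to 0.5 for this symmetric target
```

The jump move is a deterministic involution between the two basins, so preserving the local offset keeps the acceptance probability high; a plain random-walk chain would essentially never cross between modes this far apart.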

- Week 10 - 7th December - Flávio Gonçalves (UFMG) - "Exact Bayesian inference for Level-Set Cox Processes"
- Abstract: We consider a Level-Set spatial Cox process which assumes the intensity function (IF) to be piece-wise constant. The Level-Set approach considers the levels of a latent Gaussian Process to define the IF contours in a continuous and flexible way. This is an infinite dimensional model for which existing inference solutions rely on discrete approximations. This introduces a significant bias to the estimation procedure and, often, model decharacterisation. Attempts to mitigate the approximation problems inevitably lead to impractical computational costs. We propose a novel pseudo-marginal MCMC algorithm that has the exact posterior distribution as the target. The likelihood function estimator for the pseudo-marginal algorithm is devised through a Poisson estimator in which a non-centred parametrisation plays an important role.

**2018/19 Term 2:**

- Week 1 - 11th January - Neil Chada (NUS) - "Posterior convergence analysis of $\alpha$-stable processes: applications in Bayesian inversion"
- Abstract: This talk is concerned with the theoretical understanding of $\alpha$-stable sheets ${X}$. Our motivation for this is in the context of Bayesian inverse problems, where we consider the treatment of these processes as prior forms for parameter estimation. We derive various convergence results for these processes. In doing so we use a number of representations these sheets can take, such as a stochastic integral representation, but also random series expansions through Poisson processes. Our convergence analysis will rely on whether ${X}$ admits $L^p$ sample paths, and if so how regular the paths are.

- Week 2 - 18th January - James Flegal (UC Riverside) **(in room MB2.22)** - "Weighted batch means estimators in Markov chain Monte Carlo"
- Abstract: We propose a family of weighted batch means variance estimators, which are computationally efficient and can be conveniently applied in practice. The focus is on Markov chain Monte Carlo simulations and estimation of the asymptotic covariance matrix in the Markov chain central limit theorem, where conditions ensuring strong consistency are provided. Finite sample performance is evaluated through auto-regressive, Bayesian spatial-temporal, and Bayesian logistic regression examples, where the new estimators show significant computational gains with a minor sacrifice in variance compared with existing methods.
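
The weighted estimators of the talk are not reproduced here, but the plain batch means estimator they generalise fits in a few lines. In this illustrative sketch the AR(1) chain and the batch size b = ⌊√n⌋ are arbitrary choices for demonstration:

```python
import math, random

def batch_means_var(chain, b=None):
    """Plain (unweighted) batch means estimate of the asymptotic variance
    of the sample mean in the Markov chain CLT."""
    n = len(chain)
    if b is None:
        b = int(math.sqrt(n))                    # a common default batch size
    a = n // b                                   # number of full batches
    mu = sum(chain[:a * b]) / (a * b)
    bmeans = [sum(chain[k * b:(k + 1) * b]) / b for k in range(a)]
    return b * sum((m - mu)**2 for m in bmeans) / (a - 1)

# AR(1) chain x_t = rho*x_{t-1} + N(0,1); its true asymptotic variance is 1/(1-rho)^2.
rng = random.Random(42)
rho, n = 0.5, 100_000
x, chain = 0.0, []
for _ in range(n):
    x = rho * x + rng.gauss(0.0, 1.0)
    chain.append(x)
bm_var = batch_means_var(chain)
print(bm_var)  # roughly 4 = 1/(1-0.5)^2
```

Each batch mean acts as a nearly independent replicate of the overall mean, which is why scaling their empirical variance by b recovers the asymptotic variance rather than the (much smaller) marginal variance of the chain.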

- Week 3 - 25th January - Joint Session (Short Talks)
- Jure Vogrinc (Warwick) - “Skipping MCMC: A new approach to sampling tail events”
- Abstract: We will focus on the model problem of sampling from a given target distribution conditioned on the samples being outside of a given "common" set. This is of potential interest in a wide range of applications where a rare or atypical event needs to be understood. Standard MCMC methods may struggle in this setting, as the resulting conditional target can easily be multi-modal and the sampler may fail to cross the common set. I will present a modification of a "parent" RWM, called the Skipping Random Walk Metropolis, that instead of automatically rejecting the proposals into the common set tries to skip across it. It can be shown that under very mild conditions the Skipping RWM is actually just a RWM with a different proposal density, and sufficient conditions for ergodicity and CLT are inherited from the parent RWM. Per step, the Skipping RWM mixes at least as fast as the parent RWM and is also asymptotically at least as efficient. I will show some toy examples and discuss the method's applicability and the connections to other MCMC methods.

- Georgios Vasdekis (Warwick) - "A generalisation of the ZigZag process"
- Abstract: Piecewise Deterministic Markov Processes have recently drawn the attention of the MCMC community. The first reason for this is that one can simulate the entire path of such a process exactly. The second is that these processes are non-reversible, which sometimes leads to quicker mixing. One of these processes, the ZigZag process, was introduced by Bierkens and Roberts in 2016. This process moves linearly in space along specific directions for a random period of time and then changes direction. However, the original directions were only allowed to be of the form {-1,+1}^d. In this talk I will explain how one can extend this process to more directions. I will give some examples where such a generalisation could be useful, but also some heuristic reasons why one should not introduce too many directions. I will then present results on the ergodicity of this process, which extends to geometric ergodicity when the target distribution has light tails. Time permitting, I will sketch why one cannot have geometric ergodicity in the case of heavy-tailed targets.
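
For context, the original {-1,+1} ZigZag process admits a very short implementation in one dimension: for a standard Gaussian target the event times can be sampled exactly by inverting the integrated switching rate. This is an illustrative sketch of the baseline process, not the generalisation discussed in the talk:

```python
import math, random

def zigzag_gaussian(t_end=50_000.0, seed=7):
    """1-d ZigZag targeting N(0, 1): velocity theta in {-1, +1} and
    switching rate lambda(x, theta) = max(0, theta * x)."""
    rng = random.Random(seed)
    x, theta, t = 0.0, 1.0, 0.0
    m1 = m2 = 0.0                      # time integrals of x and x^2
    while t < t_end:
        e = rng.expovariate(1.0)       # Exp(1) variate
        a = theta * x
        # exact next event time: solve  integral_0^tau max(0, a + s) ds = e
        tau = min(-a + math.sqrt(max(a, 0.0)**2 + 2.0 * e), t_end - t)
        # accumulate integrals of x(s) = x + theta*s along the linear segment
        m1 += x * tau + theta * tau**2 / 2.0
        m2 += x * x * tau + theta * x * tau**2 + tau**3 / 3.0
        x += theta * tau
        theta = -theta                 # flip the velocity at the event
        t += tau
    return m1 / t_end, m2 / t_end

zz_mean, zz_second = zigzag_gaussian()
print(zz_mean, zz_second)  # close to 0.0 and 1.0
```

Because the trajectory is piecewise linear, the ergodic averages are computed by integrating along each segment exactly rather than by discretising, which is one of the attractions of these samplers.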

- Week 4 - 1st February - Mateusz Majka (Warwick) - "Sampling from invariant measures of SDEs: Non-asymptotic analysis via coupling"
- Abstract: We consider the problem of approximate sampling from invariant measures of Langevin stochastic differential equations. We first discuss a general method of analysing convergence of Euler discretisations of such equations in the Wasserstein distances, based on the coupling technique. In particular, we develop a way to control the distance between an Euler scheme and its perturbed version. We show how to apply our perturbation result to study the Unadjusted Langevin Algorithm (ULA) for approximate sampling from non-log-concave measures. We also consider its counterpart based on the Stochastic Gradient Langevin Dynamics (SGLD). Finally, we discuss how our techniques apply to the Multi-level Monte Carlo (MLMC) method for Euler schemes. The talk is based on joint work with Aleksandar Mijatovic (Warwick) and Lukasz Szpruch (Edinburgh).

- Week 5 - 8th February - Ioannis Kosmidis (Warwick) - "Towards the finite-sample variance bounds for unbiased estimators"
- Abstract: The inverse of the Fisher information matrix in a likelihood problem is i) the variance-covariance matrix of the asymptotic distribution of the maximum likelihood (ML) estimator, ii) the dominant term in the expansion of the finite-sample variance of the ML estimator, and iii) the "lowest" achievable variance-covariance that an unbiased estimator can achieve, where "lowest" here indicates that its difference from the variance of any unbiased estimator is a positive definite matrix. These three characterizations and the asymptotic unbiasedness of the ML estimator are key justifications for the widespread use of the latter in statistical practice. For example, standard regression software typically reports the ML estimates alongside estimated standard errors coming from the inversion of the Fisher information matrix at the estimates. Nevertheless, the use of that pair of estimates and estimated standard errors for inference implicitly assumes, amongst other things, that the information about the parameters in the sample is large enough for the estimator to be almost unbiased and its variance to be well-approximated by the inverse of the Fisher information matrix. In this talk, we present results from work-in-progress that aims to bridge that finite-sample gap between estimates and the estimated variance-covariance matrix. Specifically, we introduce a novel family of estimators that not only have the same limiting optimality properties as the ML estimator (consistency, asymptotic normality, unbiasedness and efficiency), but also have finite sample variance that is asymptotically closer to the inverse of the Fisher information matrix than the variance of the ML estimator is. We illustrate the properties of the proposed estimators in some well-used inferential settings.

- Week 6 - 15th February - Matthew Ludkin (Lancaster) - "Hug 'N' Hop: Explicit, non-reversible, contour-hugging MCMC"
- Abstract: Both the Bouncy Particle Sampler (BPS) and the Discrete Bouncy Particle Sampler (DBPS) are non-reversible Markov chain Monte Carlo algorithms whose action can be visualised in terms of a particle moving with a fixed-magnitude velocity. Both algorithms include an occasional step where the particle `bounces' off a hyperplane which is tangent to the gradient of the target density, making the BPS rejection-free and allowing the DBPS to propose relatively large jumps whilst maintaining a high acceptance rate. Analogously to the concatenation of leapfrog steps in HMC, we describe an algorithm which omits the straight-line movement of the BPS and DBPS and, instead, at each iteration concatenates several discrete `bounces' to provide a proposal which is on almost the same target contour as the starting point, producing a large proposed move with a high acceptance probability. Combined with a separate kernel designed for moving between contours, an explicit bouncing scheme which takes account of the local Hessian at each bounce point ensures that the proposal respects the local geometry of the target, and leads to an efficient, skew-reversible MCMC algorithm.
- Week 7 - 22nd February - Patrick Rebeschini (Oxford) - "On the Interplay between Statistics, Computation and Communication in Decentralised Learning."
- Abstract: Motivated by bandwidth limitations, privacy concerns and network instability, a large body of literature has been investigating the performance of decentralised optimisation methods to fit statistical models on datasets stored across multiple machines where there is no central server to coordinate computation. In this literature, data is typically treated as deterministic and one is interested in controlling the optimisation error. In this talk, we take a statistical point of view and assume that data come from the same unknown probability distribution. We consider the decentralised version of two of the most well-studied algorithmic paradigms for serial statistical optimisation in offline and online learning: gradient descent and upper confidence bound (UCB). For decentralised gradient descent in non-parametric regression, we investigate the choice of step-size, stopping time, and communication topology that allows one to recover optimal statistical rates with respect to the entire data on the network. We show that the choice of the communication graph can be considered as a regulariser, and that more statistics allows for less computation and less communication. (based on joint work with D. Richards)

For decentralised UCB in multi-armed bandit problems, we show that network-dependent delayed actions are key to obtaining improved regret bounds as a function of the graph topology. (based on joint work with D. Martínez-Rubio and V. Kanade)

- Week 8 - 1st March - Susana Gomes (Warwick) - "Parameter estimation for pedestrian dynamics models"
- Abstract: In this talk we present a framework for estimating parameters in macroscopic models for crowd dynamics using data from individual trajectories. We consider a model for the unidirectional flow of pedestrians in a corridor which consists of a coupling between a density dependent stochastic differential equation and a nonlinear partial differential equation for the density. In the stochastic differential equation for the trajectories, the velocity of a pedestrian decreases with the density according to the fundamental diagram. Although there is a general agreement on the basic shape of this dependence, its parametrization depends strongly on the measurement and averaging techniques used as well as the experimental setup considered. We will discuss identifiability of the parameters appearing in the fundamental diagram, introduce optimisation and Bayesian methods to perform the identification, and analyse the performance of the proposed methodology in various realistic situations. Finally, we discuss possible generalisations, including the effect of the form of the fundamental diagram and the use of experimental data.

- Week 9 - 8th March - Joint Session (Short Talks)
- Reece Mears (Warwick) - "Cracking ciphers using simulation"
- Abstract: The need to protect secrets via encryption has existed for millennia, and for just as long the desire to intercept and decode such secrets has been its counterpart. Traditionally, cryptanalysis has been a tedious, manual process, but with recent advances in computing, new methods are being developed that require little human input. This talk will assess Markov chain Monte Carlo in its ability to decode simple substitution and transposition ciphers. A new implementation for transposition ciphers is proposed, and is found to outperform the existing algorithm by all accounts. We then take a fresh approach into more complicated ciphers, a mostly unexplored area for MCMC algorithms, including one of the most notorious polyalphabetic ciphers ever conceived: the Vigenère cipher. In each case, the algorithms developed are found to be successful in decoding the ciphertexts to a human-readable standard, all the while requiring far less parameter tuning than other existing approaches. This adaptability may be the key for attacking ciphers given less knowledge of the encryption technique a priori.
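
As a toy illustration of the general approach (not the talk's algorithms), a substitution cipher can be attacked with a Metropolis chain over keys, scoring candidate decryptions by bigram log-likelihood. The reference text, plaintext, and all parameters below are invented for the example; a real attack would estimate bigram statistics from a large corpus:

```python
import math, random, string

ALPHA = string.ascii_lowercase
SYMS = ALPHA + " "

# Toy reference text for bigram statistics (a stand-in for a large corpus).
REF = ("the quick brown fox jumps over the lazy dog while the lazy dog "
       "sleeps under the tree and the fox runs over the green hill")

def bigram_logprobs(text):
    """Add-one smoothed log-probabilities for all symbol pairs."""
    counts = {}
    for pair in zip(text, text[1:]):
        counts[pair] = counts.get(pair, 0) + 1
    total = len(text) - 1
    return {(a, b): math.log((counts.get((a, b), 0) + 1) / (total + len(SYMS)**2))
            for a in SYMS for b in SYMS}

def score(text, lp):
    return sum(lp[pair] for pair in zip(text, text[1:]))

def decode(text, key):
    return text.translate(str.maketrans(ALPHA, key))

def crack(cipher, lp, n_iter=20_000, seed=5):
    """Metropolis over substitution keys: propose swapping two letters of
    the key, accept with probability min(1, exp(new score - current score))."""
    rng = random.Random(seed)
    key = list(ALPHA)                         # start from the identity key
    cur = score(decode(cipher, "".join(key)), lp)
    best_key, best = "".join(key), cur
    for _ in range(n_iter):
        i, j = rng.randrange(26), rng.randrange(26)
        key[i], key[j] = key[j], key[i]
        new = score(decode(cipher, "".join(key)), lp)
        if new >= cur or rng.random() < math.exp(new - cur):
            cur = new
            if cur > best:
                best_key, best = "".join(key), cur
        else:
            key[i], key[j] = key[j], key[i]   # undo the rejected swap
    return best_key, best

lp = bigram_logprobs(REF)
plain = "the lazy fox sleeps under the green tree"
true_key = list(ALPHA)
random.Random(9).shuffle(true_key)
cipher = plain.translate(str.maketrans(ALPHA, "".join(true_key)))
best_key, best = crack(cipher, lp)
print(best >= score(cipher, lp))  # True: the best key found never scores worse than the start
```

Tracking the best key visited (rather than the final state of the chain) is a common practical device here, since the goal is the mode of the posterior over keys rather than samples from it.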

- Ryan Chan (Warwick) - "Hierarchical Monte Carlo Fusion"
- Abstract: Monte Carlo Fusion proposes a new theory and methodology to tackle the problem of unifying distributed analyses and inferences on shared parameters from multiple sources, into a single coherent inference. This problem can appear in settings such as expert elicitation, distributed ‘big data’ problems, and tempering. ‘Hierarchical Monte Carlo Fusion’ builds upon the Monte Carlo Fusion algorithm (Dai, Pollock & Roberts 2018) which uses a ‘fork-and-join’ (or ‘divide-and-conquer’) approach, and can be useful when the number of sub-posteriors to be unified is large. We use a tempering example to illustrate the extension.

- Week 10 - 15th March - Jake Carson (Warwick) - "Bayesian Model Selection for Infectious Disease Models"
- Abstract: Infectious disease models typically contain few model parameters, but inference for these models is challenging owing to large amounts of missing information, such as the infection and recovery times of individuals. When the full conditional distribution of the hidden infection process is available, such as in discrete time epidemic models, effective approaches exist for estimating the model evidence to a high precision. These approaches make use of the forward filtering backward sampling (FFBS) algorithm to impute the hidden infection process. Since the computational cost of the FFBS algorithm grows exponentially with the number of individuals, these approaches are only tractable when analysing small populations. Recently proposed variants of the FFBS algorithm reduce the computational complexity of imputing the hidden process to linear in the number of individuals, but do not directly sample the hidden infection process from its full conditional distribution. We demonstrate how these developments can be used to form effective proposal distributions for estimation of the model evidence when studying larger populations.

**2018/19 Term 3:**

- Week 1 - 26th April - Toni Karvonen (Aalto University) - Title / Abstract TBA

- Week 2 - 3rd May - Leah South (Lancaster) - Title / Abstract TBA

- Week 3 - 10th May - Short Talks
- Dootika Vats (Warwick) - Title / Abstract TBA

- Omer Deniz Akyildiz (Warwick) Title / Abstract TBA

- Week 4 - 17th May - Jonathan Harrison (Warwick) - Title / Abstract TBA
- Week 5 - 24th May - Jeremie Houssineau (Warwick) - Title / Abstract TBA
- Week 6 - 31st May - Jorge I. González Cázares (Warwick) - Title / Abstract TBA
- Week 7 - 7th June - Sam Power (Cambridge) - Title / Abstract TBA
- Week 8 - 14th June - Daniel Paulin (Oxford) - Title / Abstract TBA
- Week 9 - 21st June - Short Talks
- Daniel Tait (Warwick) - Title / Abstract TBA
- Kangrui Wang (Warwick) - Title / Abstract TBA

- Week 10 - 28th June - Alice Corbella (Warwick) - Title / Abstract TBA

**Previous Years:**

**Some key phrases**:

- Sampling and inference for diffusions

- Exact algorithms

- Intractable likelihood

- Pseudo-marginal algorithms

- Particle filters

- Importance sampling

- MCMC

- Adaptive MCMC

- Perfect simulation

- Markov chains...

- Random structures...

- Randomised algorithms...