2018-2019
2018/19 Term 1:
- Week 1 - 5th October - Cancelled
- Week 2 - 12th October - Summary of research short talks:
- Week 3 - 19th October - Summary of research short talks:
- Week 4 - 26th October - Giulio Morina (Warwick) - "From the Bernoulli Factory to a Dice Enterprise via Perfect Sampling of Markov Chains"
- Abstract: Given a p-coin that lands heads with unknown probability p, we wish to construct an algorithm that produces an f(p)-coin for a given function f:(0, 1)→(0, 1). This problem is commonly known as the Bernoulli Factory and generic ways to design a practical algorithm for a given function f exist only in a few special cases. We present a constructive way to build an efficient Bernoulli Factory when f(p) is a rational function. Moreover, we extend the original problem to a more general setting where we have access to an m-sided die and we wish to roll a v-sided one, where the probability of rolling each face is a fixed function of the original probabilities. We achieve this by perfectly simulating from the stationary distribution of a certain class of Markov chains.
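- A minimal illustrative sketch (not the rational-function construction from the talk): two elementary Bernoulli factories in Python, an f(p) = p^2 coin and von Neumann's fair coin, built from a black-box p-coin; the function names are illustrative only.

```python
import random

def p_coin(p):
    """Black-box p-coin: returns 1 with (notionally unknown) probability p."""
    return 1 if random.random() < p else 0

def squared_coin(coin):
    """f(p) = p^2 coin: heads iff two independent flips are both heads."""
    return coin() & coin()

def fair_coin(coin):
    """von Neumann trick: an f(p) = 1/2 coin from any p-coin, p in (0, 1)."""
    while True:
        a, b = coin(), coin()
        if a != b:
            return a  # (1, 0) and (0, 1) are equally likely

if __name__ == "__main__":
    coin = lambda: p_coin(0.3)
    n = 100_000
    print(sum(squared_coin(coin) for _ in range(n)) / n)  # ~0.09
    print(sum(fair_coin(coin) for _ in range(n)) / n)     # ~0.5
```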
- Week 5 - 2nd November - Cancelled. CoSInES Launch Day.
- Week 6 - 9th November - Iker Perez (Nottingham) - "Novel approaches to efficiently augment a Markov Jump process for exact Bayesian Inference"
- Abstract: In this talk we will discuss foundational statistical challenges associated with families of Markovian jump models, which often find applications in domains such as genetics, epidemiology, mathematical biology or queueing theory. We will first review Markov jump processes, and by means of common accessible examples, illustrate the computational impediments posed by real-world application scenarios to inverse uncertainty quantification tasks. Then, we will give an overview of the recent advances linked to structured jump systems. Our work is concerned with building on uniformization procedures and we propose a novel efficient auxiliary-variable algorithm for data augmentation, which yields computationally tractable distributions suited for exact (in Monte Carlo sense) Bayesian inference in often large, infinite or multivariate population systems. We demonstrate the capabilities of the presented methods by drawing Bayesian inference for partially observed stochastic epidemics and show that it overcomes the limitations of existing vanilla approaches. This is joint work (in progress) with Theo Kypraios.
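- For readers unfamiliar with uniformization, a hedged sketch of the basic construction for simulating a Markov jump process path (the auxiliary-variable data-augmentation algorithm of the talk is not reproduced; the two-state generator below is a made-up example):

```python
import numpy as np

def uniformisation_sample(Q, x0, t_end, rng=None):
    """Simulate a Markov jump process with generator Q on [0, t_end] by
    uniformisation: candidate jump times form a Poisson process of rate
    omega >= max_i |Q_ii|, and the state evolves by the transition matrix
    P = I + Q / omega (jumps where the state stays put are 'virtual')."""
    rng = rng or np.random.default_rng()
    Q = np.asarray(Q, dtype=float)
    omega = np.max(-np.diag(Q))
    P = np.eye(Q.shape[0]) + Q / omega
    times, states = [0.0], [x0]
    t, x = 0.0, x0
    while True:
        t += rng.exponential(1.0 / omega)
        if t > t_end:
            break
        x = rng.choice(Q.shape[0], p=P[x])
        times.append(t)
        states.append(x)
    return np.array(times), np.array(states)

# Toy two-state example with rates 1.0 and 0.5; the long-run fraction of time
# spent in state 1 should be close to its stationary probability 2/3.
if __name__ == "__main__":
    Q = np.array([[-1.0, 1.0], [0.5, -0.5]])
    times, states = uniformisation_sample(Q, x0=0, t_end=2000.0,
                                          rng=np.random.default_rng(0))
    holding = np.diff(np.append(times, 2000.0))
    print(np.average(states, weights=holding))
```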
- Week 7 - 16th November - Ritabrata Dutta (Warwick) - "Well-Tempered Hamiltonian Monte Carlo on Active-Space"
- Abstract: When the gradient of the log-target distribution is available, Hamiltonian Monte Carlo (HMC) has been proved to be an efficient simulation algorithm. However, HMC performs poorly when the target is high-dimensional and has multiple isolated modes. To alleviate these problems we propose to perform HMC on a locally and continuously tempered target distribution. This tempering is based on an efficient approach to simulate molecular dynamics in high-dimensional space, known as well-tempered meta-dynamics. The tempering we suggest is performed locally and only along the directions of the maximum changes in the target which we identify as the active space of the target. The active space is the span of the eigenfunctions corresponding to the dominant eigenvalues of the expected Hessian matrix of the log-target. To capture the state dependent non-linearity of the target, we iteratively estimate the active space from the most recent batch of samples obtained from the simulation. Finally, we suggest a re-weighting scheme based on path-sampling to provide importance weights for the samples drawn from the continuously-tempered distribution. We illustrate the performance of this scheme for target distributions with complex geometry and multiple modes on high-dimensional spaces in comparison with traditional HMC with the No-U-Turn Sampler.
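- For context, a minimal sketch of plain HMC with a leapfrog integrator and standard Gaussian momentum (the locally tempered, active-space variant described in the abstract is not reproduced; function and parameter names are illustrative):

```python
import numpy as np

def hmc_step(x, log_target, grad_log_target, step_size=0.1, n_leapfrog=20, rng=None):
    """One standard HMC step with unit-mass Gaussian momentum (leapfrog integrator)."""
    rng = rng or np.random.default_rng()
    p0 = rng.standard_normal(x.shape)
    x_new, p = x.copy(), p0.copy()
    # Leapfrog integration of the Hamiltonian dynamics
    p = p + 0.5 * step_size * grad_log_target(x_new)
    for _ in range(n_leapfrog - 1):
        x_new = x_new + step_size * p
        p = p + step_size * grad_log_target(x_new)
    x_new = x_new + step_size * p
    p = p + 0.5 * step_size * grad_log_target(x_new)
    # Metropolis accept/reject based on the change in the Hamiltonian
    log_accept = (log_target(x_new) - 0.5 * p @ p) - (log_target(x) - 0.5 * p0 @ p0)
    return x_new if np.log(rng.uniform()) < log_accept else x

# Example: sample from a 2-d standard Gaussian
if __name__ == "__main__":
    log_target = lambda x: -0.5 * x @ x
    grad_log_target = lambda x: -x
    rng = np.random.default_rng(0)
    x, samples = np.zeros(2), []
    for _ in range(5000):
        x = hmc_step(x, log_target, grad_log_target, rng=rng)
        samples.append(x)
    print(np.mean(samples, axis=0), np.var(samples, axis=0))  # ~0 and ~1
```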
- Week 8 - 23rd November - Stephen Connor (York) - "Omnithermal Perfect Simulation for Multi-server Queues"
- Abstract: The last few years have seen much progress in the area of perfect simulation algorithms for multi-server queueing systems, allowing us to sample from the exact equilibrium distribution of the Kiefer-Wolfowitz workload vector. This talk will describe an “omnithermal” variant of one such algorithm for M/G/c queues, which permits simultaneous sampling from the equilibrium distributions for a range of c (the number of servers) at relatively little additional cost.
- Week 9 - 30th November - Emilia Pompe (Oxford) - "Adaptive MCMC for Multimodal Distributions"
- Abstract: We propose a new Monte Carlo method for sampling from multimodal distributions (Jumping Adaptive Multimodal Sampler). The idea of this technique is based on splitting the task into two: finding the modes of the target distribution and sampling, given the knowledge of the locations of the modes. The sampling algorithm is based on steps of two types: local ones, preserving the mode, and jumps to a region associated with a different mode. Besides, the method learns the optimal parameters while it runs, without requiring user intervention. The main properties of our algorithm will be discussed and its performance will be illustrated with several examples of multimodal target distributions. Some ergodic results that we proved for this method will also be presented. This is joint work with Chris Holmes and Krys Latuszynski.
- Week 10 - 7th December - Flávio Gonçalves (UFMG) - "Exact Bayesian inference for Level-Set Cox Processes"
- Abstract: We consider a Level-Set spatial Cox process which assumes the intensity function (IF) to be piece-wise constant. The Level-Set approach considers the levels of a latent Gaussian Process to define the IF contours in a continuous and flexible way. This is an infinite dimensional model for which existing inference solutions rely on discrete approximations. This introduces a significant bias to the estimation procedure and, often, model decharacterisation. Attempts to mitigate the approximation problems inevitably lead to impractical computational costs. We propose a novel pseudo-marginal MCMC algorithm that has the exact posterior distribution as the target. The likelihood function estimator for the pseudo-marginal algorithm is devised through a Poisson estimator in which a non-centred parametrisation plays an important role.
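- A hedged sketch of a generic pseudo-marginal Metropolis-Hastings sampler, the ingredient the proposed algorithm builds on (the Poisson estimator and non-centred parametrisation of the talk are not shown; the noisy-likelihood toy model below is made up for illustration):

```python
import numpy as np

def pseudo_marginal_mh(log_prior, log_lik_estimator, proposal_sampler,
                       theta0, n_iter, rng=None):
    """Pseudo-marginal MH: the intractable likelihood is replaced by a non-negative
    unbiased estimator, and the current estimate is recycled until the next
    acceptance (a symmetric random-walk proposal is assumed)."""
    rng = rng or np.random.default_rng()
    theta = np.atleast_1d(np.array(theta0, dtype=float))
    log_lik = log_lik_estimator(theta, rng)
    chain = np.empty((n_iter, theta.size))
    for k in range(n_iter):
        theta_prop = proposal_sampler(theta, rng)
        log_lik_prop = log_lik_estimator(theta_prop, rng)
        log_ratio = (log_prior(theta_prop) + log_lik_prop) - (log_prior(theta) + log_lik)
        if np.log(rng.uniform()) < log_ratio:
            theta, log_lik = theta_prop, log_lik_prop
        chain[k] = theta
    return chain

# Toy example: N(theta, 1) data, with the likelihood "estimated" by the exact value
# times unbiased multiplicative log-normal noise (a stand-in for a real estimator).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(1.5, 1.0, size=50)
    def log_lik_estimator(theta, r):
        exact = -0.5 * np.sum((data - theta[0]) ** 2)
        return exact + r.normal(0.0, 0.3) - 0.5 * 0.3 ** 2   # E[exp(noise term)] = 1
    chain = pseudo_marginal_mh(
        log_prior=lambda t: -0.5 * np.sum(t ** 2) / 100.0,   # N(0, 10^2) prior
        log_lik_estimator=log_lik_estimator,
        proposal_sampler=lambda t, r: t + 0.2 * r.standard_normal(t.shape),
        theta0=[0.0], n_iter=20_000, rng=rng)
    print(chain[5000:].mean(), chain[5000:].std())
```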
2018/19 Term 2:
- Week 1 - 11th January - Neil Chada (NUS) - "Posterior convergence analysis of $\alpha$-stable processes: applications in Bayesian inversion"
- Abstract: This talk is concerned with the theoretical understanding of $\alpha$-stable sheets $X$. Our motivation for this is in the context of Bayesian inverse problems, where we consider the treatment of these processes as prior forms for parameter estimation. We derive various convergence results for these processes. In doing so we use a number of variants which these sheets can take, such as a stochastic integral representation, but also random series expansions through Poisson processes. Our convergence analysis will rely on whether $X$ admits $L^p$ sample paths, and if so how regular the paths are.
- Week 2 - 18th January - James Flegal (UC Riverside) (in room MB2.22) - "Weighted batch means estimators in Markov chain Monte Carlo"
- Abstract: We propose a family of weighted batch means variance estimators, which are computationally efficient and can be conveniently applied in practice. The focus is on Markov chain Monte Carlo simulations and estimation of the asymptotic covariance matrix in the Markov chain central limit theorem, where conditions ensuring strong consistency are provided. Finite sample performance is evaluated through auto-regressive, Bayesian spatial-temporal, and Bayesian logistic regression examples, where the new estimators show significant computational gains with a minor sacrifice in variance compared with existing methods.
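- A hedged sketch of the plain (unweighted, single-batch-size) batch means estimator that the weighted estimators of the talk generalise; the AR(1) example is made up and has a known asymptotic variance for comparison:

```python
import numpy as np

def batch_means_variance(chain, batch_size=None):
    """Plain batch means estimate of the asymptotic variance sigma^2 in the MCMC CLT
    sqrt(n) * (xbar - mu) -> N(0, sigma^2), for a univariate chain."""
    chain = np.asarray(chain, dtype=float)
    n = len(chain)
    b = batch_size or int(np.floor(np.sqrt(n)))   # a common default: b = floor(sqrt(n))
    a = n // b                                    # number of full batches
    used = chain[: a * b]
    means = used.reshape(a, b).mean(axis=1)
    return b * np.sum((means - used.mean()) ** 2) / (a - 1)

# Example: AR(1) chain x_t = rho * x_{t-1} + N(0,1), whose true asymptotic
# variance for the sample mean is 1 / (1 - rho)^2.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    rho, n = 0.7, 200_000
    x = np.empty(n)
    x[0] = 0.0
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.standard_normal()
    print(batch_means_variance(x), 1.0 / (1.0 - rho) ** 2)
```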
- Week 3 - 25th January - Joint Session (Short Talks)
- Jure Vogrinc (Warwick) - “Skipping MCMC: A new approach to sampling tail events”
- Abstract: We will focus on the model problem of sampling from a given target distribution conditioned on the samples being outside of a given “common” set. This is of potential interest in a wide range of applications where a rare or atypical event needs to be understood. Standard MCMC methods may struggle in this setting, as the resulting conditional target can easily be multi-modal and the chain may fail to cross the common set. I will present a modification of a “parent” RWM, called the Skipping Random Walk Metropolis, which instead of automatically rejecting proposals that land in the common set tries to skip across it. It can be shown that under very mild conditions the Skipping RWM is actually just a RWM with a different proposal density, and sufficient conditions for ergodicity and a CLT are inherited from the parent RWM. Per step, the Skipping RWM mixes at least as fast as the parent RWM and is also asymptotically at least as efficient. I will show some toy examples and discuss the method's applicability and connections to other MCMC methods.
- Georgios Vasdekis (Warwick) - "A generalisation of the ZigZag process"
- Abstract: Piecewise Deterministic Markov Processes have recently drawn the attention of the MCMC community. The first reason for this is that one can simulate the entire path of such a process exactly. The second is that these processes are non-reversible, which sometimes leads to quicker mixing. In 2016, Bierkens and Roberts introduced one of these processes, the ZigZag process. This process moves linearly in space in specific directions for a random period of time and then changes direction. However, the original directions were only allowed to be of the form {-1,+1}^d. In this talk I will explain how one can extend this process to more directions. I will give some examples where such a generalisation could be useful, but also some heuristic reasons why one should not introduce too many directions. I will then present results on the ergodicity of this process, which extends to geometric ergodicity when the target distribution has light tails. Time permitting, I will sketch why one cannot have geometric ergodicity in the case of heavy-tailed targets.
- Week 4 - 1st February - Mateusz Majka (Warwick) - "Sampling from invariant measures of SDEs: Non-asymptotic analysis via coupling"
- Abstract: We consider the problem of approximate sampling from invariant measures of Langevin stochastic differential equations. We first discuss a general method of analysing convergence of Euler discretisations of such equations in the Wasserstein distances, based on the coupling technique. In particular, we develop a way to control the distance between an Euler scheme and its perturbed version. We show how to apply our perturbation result to study the Unadjusted Langevin Algorithm (ULA) for approximate sampling from non-log-concave measures. We also consider its counterpart based on the Stochastic Gradient Langevin Dynamics (SGLD). Finally, we discuss how our techniques apply to the Multi-level Monte Carlo (MLMC) method for Euler schemes. The talk is based on joint work with Aleksandar Mijatovic (Warwick) and Lukasz Szpruch (Edinburgh).
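- A minimal sketch of the Unadjusted Langevin Algorithm analysed in the talk (the coupling arguments, SGLD and MLMC variants are not shown); the bimodal 1-d target below is a made-up example of a non-log-concave density:

```python
import numpy as np

def ula(grad_log_target, x0, step_size, n_steps, rng=None):
    """Unadjusted Langevin Algorithm: Euler-Maruyama discretisation of
    dX_t = grad log pi(X_t) dt + sqrt(2) dB_t, with no Metropolis correction."""
    rng = rng or np.random.default_rng()
    x = np.array(x0, dtype=float)
    out = np.empty((n_steps, x.size))
    for k in range(n_steps):
        x = x + step_size * grad_log_target(x) \
              + np.sqrt(2.0 * step_size) * rng.standard_normal(x.shape)
        out[k] = x
    return out

# Example: bimodal target pi(x) proportional to exp(-(x^2 - 1)^2 / 4), modes at +-1.
if __name__ == "__main__":
    grad = lambda x: -(x ** 2 - 1.0) * x      # derivative of -(x^2 - 1)^2 / 4
    samples = ula(grad, x0=[0.0], step_size=0.01, n_steps=50_000,
                  rng=np.random.default_rng(2))
    print(samples.mean(), (samples > 0).mean())   # roughly 0 and 0.5 by symmetry
```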
- Week 5 - 8th February - Ioannis Kosmidis (Warwick) - "Towards the finite-sample variance bounds for unbiased estimators"
- Abstract: The inverse of the Fisher information matrix in a likelihood problem is i) the variance-covariance matrix of the asymptotic distribution of the maximum likelihood (ML) estimator ii) the dominant term in the expansion of the finite-sample variance of the ML estimator, and iii) the "lowest" achievable variance-covariance that an unbiased estimator can achieve, where "lowest" here indicates that its difference from the variance of any unbiased estimator is a positive definite matrix. These three characterizations and the asymptotic unbiasedness of the ML estimator are key justifications for the widespread use of the latter in statistical practice. For example, standard regression software typically reports the ML estimates alongside estimated standard errors coming from the inversion of the Fisher information matrix at the estimates. Nevertheless, the use of that pair of estimates and estimated standard errors for inference implicitly assumes, amongst other things, that the information about the parameters in the sample is large enough for the estimator to be almost unbiased and its variance to be well-approximated by the inverse of the Fisher information matrix. In this talk, we present results from work-in-progress that aims to bridge that finite-sample gap between estimates and the estimated variance-covariance matrix. Specifically, we introduce a novel family of estimators that not only have the same limiting optimality properties as the ML estimator (consistency and asymptotic normality, unbiasedness and efficiency), but also have finite sample variance that is asymptotically closer to the inverse of the Fisher information matrix than the variance of the ML estimator is. We illustrate the properties of the proposed estimators in some well-used inferential settings.
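- A small numerical illustration of the finite-sample gap discussed above, using the rate of an exponential distribution: with n = 10 observations the ML estimator 1/(sample mean) is visibly biased and its variance exceeds the inverse Fisher information lambda^2/n (a made-up example, not from the talk):

```python
import numpy as np

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    lam, n, reps = 2.0, 10, 200_000
    # ML estimator of the rate: 1 / sample mean, over many replicated samples
    mle = 1.0 / rng.exponential(1.0 / lam, size=(reps, n)).mean(axis=1)
    print("mean of MLE:", mle.mean(), "(true rate:", lam, ")")
    print("variance of MLE:", mle.var(),
          "inverse Fisher information:", lam ** 2 / n)
```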
- Week 6 - 15th February - Matthew Ludkin (Lancaster) - "Hug 'N' Hop: Explicit, non-reversible, contour-hugging MCMC"
- Abstract: Both the Bouncy Particle Sampler (BPS) and the Discrete Bouncy Particle Sampler (DBPS) are non-reversible Markov chain Monte Carlo algorithms whose action can be visualised in terms of a particle moving with a fixed-magnitude velocity. Both algorithms include an occasional step where the particle `bounces' off a hyperplane which is tangent to the gradient of the target density, making the BPS rejection-free and allowing the DBPS to propose relatively large jumps whilst maintaining a high acceptance rate. Analogously to the concatenation of leapfrog steps in HMC, we describe an algorithm which omits the straight-line movement of the BPS and DBPS and, instead, at each iteration concatenates several discrete `bounces' to provide a proposal which is on almost the same target contour as the starting point, producing a large proposed move with a high acceptance probability. Combined with a separate kernel designed for moving between contours, an explicit bouncing scheme which takes account of the local Hessian at each bounce point ensures that the proposal respects the local geometry of the target, and leads to an efficient, skew-reversible MCMC algorithm.
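- A minimal sketch of the elementary `bounce' operation shared by BPS-type samplers, reflecting the velocity in the hyperplane tangent to the target contour (i.e. orthogonal to the gradient); the Hug and Hop kernels themselves, including the Hessian-aware bounce, are not reproduced here:

```python
import numpy as np

def bounce(velocity, grad_log_target_at_x):
    """Reflect the velocity in the hyperplane orthogonal to the gradient g of
    log pi at the current point: v' = v - 2 <v, g> g / |g|^2."""
    g = grad_log_target_at_x
    return velocity - 2.0 * (velocity @ g) / (g @ g) * g

# Sanity check: the bounce preserves the speed and flips the component along g.
if __name__ == "__main__":
    rng = np.random.default_rng(4)
    v, g = rng.standard_normal(5), rng.standard_normal(5)
    v_new = bounce(v, g)
    print(np.allclose(v @ v, v_new @ v_new), np.allclose(v_new @ g, -(v @ g)))
```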
- Week 7 - 22nd February - Patrick Rebeschini (Oxford) - "On the Interplay between Statistics, Computation and Communication in Decentralised Learning."
- Abstract: Motivated by bandwidth limitations, privacy concerns and network instability, a large body of literature has been investigating the performance of decentralised optimisation methods to fit statistical models on datasets stored across multiple machines where there is no central server to coordinate computation. In this literature, data is typically treated as deterministic and one is interested in controlling the optimisation error. In this talk, we take a statistical point of view and assume that data come from the same unknown probability distribution. We consider the decentralised version of two of the most well-studied algorithmic paradigms for serial statistical optimisation in offline and online learning: gradient descent and upper confidence bound (UCB). For decentralised gradient descent in non-parametric regression, we investigate the choice of step-size, stopping time, and communication topology that allows to recover optimal statistical rates with respect to the entire data on the network. We show that the choice of the communication graph can be considered as a regulariser, and that more statistics allows for less computation and less communication. (based on joint work with D. Richards)
For decentralised UCB in multi-armed bandit problems, we show that network-dependent delayed actions are key to obtain improved regret bounds as a function of the graph topology. (based on joint work D. Martínez-Rubio and V. Kanade)
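- For reference, a sketch of the serial UCB1 baseline on Bernoulli arms (the decentralised, delayed-action version studied in the talk is not reproduced; the arm means are made up):

```python
import numpy as np

def ucb1(arm_means, horizon, rng=None):
    """Serial UCB1: after pulling each arm once, always pull the arm maximising
    empirical mean + sqrt(2 log t / n_pulls)."""
    rng = rng or np.random.default_rng()
    arm_means = np.asarray(arm_means, dtype=float)
    k = len(arm_means)
    counts, sums = np.zeros(k), np.zeros(k)
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1                                  # initialisation round
        else:
            arm = int(np.argmax(sums / counts + np.sqrt(2 * np.log(t) / counts)))
        counts[arm] += 1
        sums[arm] += rng.binomial(1, arm_means[arm])
    pseudo_regret = np.sum(counts * (arm_means.max() - arm_means))
    return counts, pseudo_regret

if __name__ == "__main__":
    counts, regret = ucb1([0.4, 0.5, 0.6], horizon=20_000,
                          rng=np.random.default_rng(5))
    print(counts, regret)   # most pulls go to the best arm; regret grows slowly
```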
- Week 8 - 1st March - Susana Gomes (Warwick) - "Parameter estimation for pedestrian dynamics models"
- Abstract: In this talk we present a framework for estimating parameters in macroscopic models for crowd dynamics using data from individual trajectories. We consider a model for the unidirectional flow of pedestrians in a corridor which consists of a coupling between a density dependent stochastic differential equation and a nonlinear partial differential equation for the density. In the stochastic differential equation for the trajectories, the velocity of a pedestrian decreases with the density according to the fundamental diagram. Although there is a general agreement on the basic shape of this dependence, its parametrization depends strongly on the measurement and averaging techniques used as well as the experimental setup considered. We will discuss identifiability of the parameters appearing in the fundamental diagram, introduce optimisation and Bayesian methods to perform the identification, and analyse the performance of the proposed methodology in various realistic situations. Finally, we discuss possible generalisations, including the effect of the form of the fundamental diagram and the use of experimental data.
- Week 9 - 8th March - Joint Session (Short Talks)
- Reece Mears (Warwick) - "Cracking ciphers using simulation"
- Abstract: The need to protect secrets via encryption has existed for millennia, and for just as long the desire to intercept and decode such secrets has been its counterpart. Traditionally, cryptanalysis has been a tedious, manual process, but with recent advances in computing, new methods are being developed that require little human input. This talk will assess Markov chain Monte Carlo in its ability to decode simple substitution and transposition ciphers. A new implementation for transposition ciphers is proposed, and is found to outperform the existing algorithm by all accounts. We then take a fresh approach to more complicated ciphers, a mostly unexplored area for MCMC algorithms, including one of the most notorious polyalphabetic ciphers ever conceived: the Vigenère cipher. In each case, the algorithms developed are found to be successful in decoding the ciphertexts to a human-readable standard, all the while requiring far less parameter tuning than other existing approaches. This adaptability may be the key to attacking ciphers given less knowledge of the encryption technique a priori.
- Ryan Chan (Warwick) - "Hierarchical Monte Carlo Fusion"
- Abstract: Monte Carlo Fusion proposes a new theory and methodology to tackle the problem of unifying distributed analyses and inferences on shared parameters from multiple sources, into a single coherent inference. This problem can appear in settings such as expert elicitation, distributed ‘big data’ problems, and tempering. ‘Hierarchical Monte Carlo Fusion’ builds upon the Monte Carlo Fusion algorithm (Dai, Pollock & Roberts 2018) which uses a ‘fork-and-join’ (or ‘divide-and-conquer’) approach, and can be useful when the number of sub-posteriors to be unified is large. We use a tempering example to illustrate the extension.
- Week 10 - 15th March - Jake Carson (Warwick) - "Bayesian Model Selection for Infectious Disease Models"
- Abstract: Infectious disease models typically contain few model parameters, but inference for these models is challenging owing to large amounts of missing information, such as the infection and recovery times of individuals. When the full conditional distribution of the hidden infection process is available, such as in discrete time epidemic models, effective approaches exist for estimating the model evidence to a high precision. These approaches make use of the forward filtering backward sampling (FFBS) algorithm to impute the hidden infection process. Since the computational cost of the FFBS algorithm grows exponentially with the number of individuals, these approaches are only tractable when analysing small populations. Recently proposed variants of the FFBS algorithm reduce the computational complexity of imputing the hidden process to linear in the number of individuals, but do not directly sample the hidden infection process from its full conditional distribution. We demonstrate how these developments can be used to form effective proposal distributions for estimation of the model evidence when studying larger populations.
2018/19 Term 3:
- Week 1 - 26th April - Toni Karvonen (Aalto University) - "Classical approximation and Gaussian process regression"
- Abstract: This talk discusses the construction of different Gaussian process models whose associated posterior means coincide with "classical" interpolation or quadrature methods such as polynomial or spline interpolants or Gaussian or Monte Carlo quadratures. This is motivated by a desire to endow these well-tested and robust methods with meaningful statistical measures of uncertainty. We present three approaches for recovering polynomial interpolation and quadrature: (i) polynomial covariance kernels, (ii) inclusion of a parametric prior mean with coefficients that are given improper priors, and (iii) increasingly flat stationary covariance kernels. Approaches (i) and (iii) are not useful for uncertainty quantification, giving rise to degenerate posteriors. Approach (ii) is flexible and general but its posteriors are, to some extent, arbitrary. Finally, we also review some results on the relationship of spline interpolation and Gaussian process regression with integrated Brownian motion kernels.
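- A minimal Gaussian process regression sketch; with the Brownian-motion kernel k(s, t) = min(s, t), the posterior mean interpolates the training points piecewise-linearly, a simple instance of the kind of classical-method/GP correspondence discussed above (the talk's constructions (i)-(iii) are not reproduced):

```python
import numpy as np

def gp_posterior(kernel, X_train, y_train, X_test, noise_var=1e-10):
    """Gaussian process regression: posterior mean and pointwise variance at X_test."""
    K = kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_star = kernel(X_test, X_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    var = np.diag(kernel(X_test, X_test)) - np.sum(v ** 2, axis=0)
    return mean, var

# Example: Brownian-motion kernel on (0, 1]; the posterior mean is piecewise
# linear between the training points, with zero variance at those points.
if __name__ == "__main__":
    kernel = lambda A, B: np.minimum(A[:, None], B[None, :])
    X_train = np.array([0.25, 0.5, 0.9])
    y_train = np.sin(2 * np.pi * X_train)
    X_test = np.linspace(0.01, 1.0, 5)
    mean, var = gp_posterior(kernel, X_train, y_train, X_test)
    print(mean)
    print(var)
```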
- Week 2 - 3rd May - Leah South (Lancaster) - "Regularised Zero-Variance Control Variates"
- Abstract: Zero-variance control variates (ZV-CV) are a post-processing method to reduce the variance of Monte Carlo estimators of expectations using the derivatives of the log target. Once the derivatives are available, the only additional computational effort is solving a linear regression problem. Significant variance reductions have been achieved with this method in low dimensional examples, but the number of covariates in the regression rapidly increases with the dimension of the target. We propose to exploit different types of regularisation to make the method more flexible and feasible, particularly in higher dimensions. Our novel methods retain the unbiasedness property of the estimators. The benefits of regularised ZV-CV for Bayesian inference will be illustrated using several examples, including a 61-dimensional example. This is joint work with Antonietta Mira, Chris Oates and Chris Drovandi.
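- A hedged sketch of first-order ZV-CV with a plain least-squares fit (the regularised fits proposed in the talk would replace the `lstsq` step); the Gaussian toy integrand is made up and uses iid rather than MCMC samples:

```python
import numpy as np

def zvcv_first_order(f_vals, scores):
    """First-order zero-variance control variates: regress f on the score
    u = grad log pi (which has mean zero under pi) and subtract the fitted
    score part.  The plain OLS fit below is the unregularised baseline."""
    n = len(f_vals)
    X = np.column_stack([np.ones(n), scores])
    beta, *_ = np.linalg.lstsq(X, f_vals, rcond=None)
    return np.mean(f_vals - scores @ beta[1:])

# Toy check: f(x) = x1 + x1^2 under N(0, I_3) has mean 1; the linear-in-x1 part
# is absorbed by the control variate, so the ZV-CV estimate has much lower variance.
if __name__ == "__main__":
    rng = np.random.default_rng(6)
    x = rng.standard_normal((500, 3))
    f = x[:, 0] + x[:, 0] ** 2
    scores = -x                      # grad log pi(x) = -x for a standard Gaussian
    print(np.mean(f), zvcv_first_order(f, scores))
```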
- Week 3 - 10th May - Short Talks
- Dootika Vats (Warwick) - "Revisiting the Gelman-Rubin Diagnostic"
- Abstract: Gelman and Rubin's (1992) convergence diagnostic is one of the most popular methods for terminating a Markov chain Monte Carlo (MCMC) sampler. Since the seminal paper, researchers have developed sophisticated methods of variance estimation for Monte Carlo averages. We show that this class of estimators finds immediate use in the Gelman-Rubin statistic, a connection not established in the literature before. We incorporate these estimators to upgrade both the univariate and multivariate Gelman-Rubin statistics, leading to increased stability in MCMC termination time. An immediate advantage is that our new Gelman-Rubin statistic can be calculated for a single chain. In addition, we establish a relationship between the Gelman-Rubin statistic and effective sample size. Leveraging this relationship, we develop a principled cutoff criterion for the Gelman-Rubin statistic. Finally, we demonstrate the utility of our improved diagnostic via an example. This work is joint with Christina Knudson, University of St. Thomas, Minnesota.
- Omer Deniz Akyildiz (Warwick) - "Probabilistic Incremental Optimisation"
- Abstract: In this talk, I will introduce a connection between incremental optimization and Bayesian filtering algorithms. After mentioning how reinterpretation of optimization methods as inference algorithms enables us to develop more advanced and uncertainty aware optimizers, I will summarize some applications of this connection for developing novel probabilistic optimizers and show some experimental results.
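- Relating to the first short talk above, a sketch of the classical (1992) univariate Gelman-Rubin statistic computed from m parallel chains; the single-chain version and improved variance estimators from the talk are not shown:

```python
import numpy as np

def gelman_rubin(chains):
    """Classical univariate Gelman-Rubin potential scale reduction factor.
    chains : (m, n) array containing m parallel chains of length n."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)     # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()       # within-chain variance
    var_hat = (n - 1) / n * W + B / n           # pooled variance estimate
    return np.sqrt(var_hat / W)

# Example: independent, well-mixed chains give a value close to 1.
if __name__ == "__main__":
    rng = np.random.default_rng(7)
    print(gelman_rubin(rng.standard_normal((4, 10_000))))
```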
- Week 4 - 17th May - Jonathan Harrison (Warwick) - "Combining summary statistics in approximate Bayesian computation"
- Abstract: To infer the parameters of mechanistic models with intractable likelihoods, techniques such as approximate Bayesian computation (ABC) are increasingly being adopted. One of the main disadvantages of ABC in practical situations, however, is that parameter inference must generally rely on summary statistics of the data. This is particularly the case for problems involving high-dimensional data, such as biological imaging experiments. However, some summary statistics contain more information about parameters of interest than others, and it is not always clear how to weight their contributions within the ABC framework. We address this problem by developing an automatic, adaptive algorithm that chooses weights for each summary statistic. Our algorithm aims to maximize the distance between the prior and the approximate posterior by automatically adapting the weights within the ABC distance function. We justify the algorithm theoretically and demonstrate the effectiveness of our algorithm by applying it to a variety of test problems, including several stochastic models of biochemical reaction networks.
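- A hedged sketch of rejection ABC with a fixed, user-chosen weighting of summary statistics; the adaptive scheme of the talk would update `weights` automatically, and the Gaussian toy model below is made up for illustration:

```python
import numpy as np

def abc_rejection(observed_summaries, prior_sampler, simulator, summarise,
                  weights, n_draws=10_000, quantile=0.01, rng=None):
    """Rejection ABC keeping the prior draws whose weighted Euclidean summary
    distance to the observed data falls below the given empirical quantile."""
    rng = rng or np.random.default_rng()
    thetas = np.array([prior_sampler(rng) for _ in range(n_draws)])
    summaries = np.array([summarise(simulator(t, rng)) for t in thetas])
    dist = np.sqrt((((summaries - observed_summaries) ** 2) * weights).sum(axis=1))
    return thetas[dist <= np.quantile(dist, quantile)]

# Toy example: infer the mean of a Gaussian from (mean, sd) summaries.
if __name__ == "__main__":
    rng = np.random.default_rng(8)
    data = rng.normal(2.0, 1.0, size=100)
    summarise = lambda y: np.array([y.mean(), y.std()])
    post = abc_rejection(
        observed_summaries=summarise(data),
        prior_sampler=lambda r: r.uniform(-10, 10),
        simulator=lambda theta, r: r.normal(theta, 1.0, size=100),
        summarise=summarise,
        weights=np.array([1.0, 0.1]),   # down-weight the less informative sd summary
        rng=rng)
    print(post.mean(), post.std())
```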
- Week 5 - 24th May - Jeremie Houssineau (Warwick) - "Bayesian inference for dynamical systems with outliers and detection failures"
- Abstract: When data are collected, outliers are often recorded instead of, or along with, the data points of interest at each observation time. Although the Bayesian philosophy encourages us to encompass all sources of uncertainty within one single inference framework, outliers and other undesirable effects are commonly removed through pre-processing in order to deliver a 'cleaned' version of the data for which a principled Bayesian approach can be used. In spite of being a known source of error, pre-processing is generally accepted since it is challenging to integrate these outliers in a Bayesian inference algorithm, as they can be difficult to characterise from a statistical viewpoint. In this talk, I will suggest using an alternative representation of uncertainty, compatible with the Bayesian philosophy, which allows for describing phenomena whose statistical properties are partially or fully unknown.
- Week 6 - 31st May - Jorge I. González Cázares (Warwick) - "Geometrically convergent simulation of the extrema of Lévy processes"
- Abstract: We develop a novel Monte Carlo algorithm for the simulation from the joint law of the position, the running supremum and the time of the supremum of a general Lévy process at an arbitrary finite time. We prove that the bias decays geometrically, in contrast to the power law for the random walk approximation (RWA). We identify the law of the error and, inspired by the recent work of Ivanovs on RWA, characterise its asymptotic behaviour. If the increments of the Lévy process cannot be sampled directly, we combine our algorithm with the Asmussen-Rosiński Gaussian approximation of small jumps by choosing the rate of decay of the cutoff level for small jumps so that the corresponding MC and MLMC estimators have minimal computational complexity.
- Week 7 - 7th June - Sam Power (Cambridge) - "Piecewise-Deterministic Markov Processes with General ODE Dynamics"
- Abstract: Piecewise-Deterministic Markov Processes (PDMPs) have attracted attention in recent years as a non-reversible alternative to traditional reversible MCMC methods. By using a combination of deterministic dynamics and jump processes, these methods are often able to suppress random-walk behaviour and reach equilibrium rapidly.
Although the PDMP framework accommodates a wide range of underlying dynamics in principle, existing approaches have tended to use quite simple dynamics, such as straight lines and elliptical orbits. In this work, I present a procedure which allows one to use a general dynamical system in the PDMP framework to sample from a given measure. The procedure makes use of `trajectorial reversibility’, a generalisation of `detailed balance’ which allows for tractable computation with otherwise non-reversible processes. Correctness of the procedure is established in a general setting, and specific, constructive recommendations are made for how to implement the resulting algorithms in practice.
This is joint work with Sergio Bacallado.
- Week 8 - 14th June - Daniel Paulin (Oxford) - "Connections between optimization and sampling"
- Abstract: In this talk, I am going to look at some connections between optimization and sampling.
In "Hamiltonian descent methods", we introduce a new optimization method based on conformal Hamiltonian dynamics. It was inspired by the literature on Hamiltonian MCMC methods. The use of general kinetic energies allows us to obtain linear rates of convergence for a much larger class than strongly convex and smooth functions.
"Dual Space Preconditioning for Gradient Descent" applies a similar idea to gradient descent. We introduce a new optimization method based on nonlinear preconditioning of gradient descent, with simple and transparent conditions for convergence.
"Randomized Hamiltonian Monte Carlo as Scaling Limit of the Bouncy Particle Sampler and Dimension-Free Convergence Rates" studies the high dimensional behaviour of the Bouncy Particle Sampler (BPS), a non-reversible piecewise deterministic MCMC method. Although the paths of this method are straight lines, we show that in high dimensions they converge to a Randomised Hamiltonian Monte Carlo (RHMC) process, whose paths are determined by the Hamiltonian dynamics. We also give a characterization of the mixing rate of the RHMC process for log-concave target distributions that can be used to tune the parameters of BPS.
- Week 9 - 21st June - Short Talks
- Daniel Tait (Warwick) - "Physics Informed Models of Dynamical Systems using Gaussian Processes"
- Abstract: The successful modelling of a dynamical system using Gaussian processes requires more than just a kernel with a temporal argument. In this talk we introduce the class of latent force models, a class of dynamical systems driven by latent Gaussian processes, and describe how these models allow for the embedding of relevant properties of dynamical systems into flexible data driven approaches. We then discuss the harder problem of incorporating these flexible models into more complex mechanistic descriptions as a means of representing model uncertainty in physically informed models for machine learning.
- Kangrui Wang (Warwick) - "Bayesian supervised learning for high dimensional data"
- Abstract: This talk is based on my PhD thesis on Bayesian supervised learning. I focus on the development of methodologies that help undertake learning of functional relationships between variables, given high-dimensional observations. The probabilistic learning of the functional relation between these variables is done by modelling this function with a high-dimensional Gaussian Process (GP), and the likelihood is then parametrised by multiple covariance matrices. These covariance matrices can also be used to implement a graphical model. We develop an MCMC scheme to learn the time-inhomogeneous random graph from the data. In this way, uncertainty in the data is propagated to the random graph.
- Week 10 - 28th June - Cancelled