Algorithms & Computationally Intensive Inference seminars
The seminars take place on Fridays at 1 pm (UK time) in room MB0.08.
2024-2025 Organisers: Wenkai Xu & Filippo Pagani
If you would like to speak, or to be included in seminar emails, please contact one of the organisers.
Website URL: www.warwick.ac.uk/compstat
Mailing List Sign-Up: http://mailman1.csv.warwick.ac.uk/mailman/listinfo/algorithmseminar
Mailing List: algorithmseminar@listserv.csv.warwick.ac.uk (NB - only approved members can post)
Term 1:
Date | Speaker | Title
29/11 | Lukas Trottner (Birmingham) | Learning to reflect – On data-driven approaches to stochastic optimal control
Note: Hybrid
Abstract: Even though theoretical solutions to stochastic optimal control problems are well understood in many scenarios, their practical use suffers from the assumption that the dynamics of the underlying stochastic process are known. This raises the statistical challenge of developing purely data-driven strategies. In this talk we focus on singular control problems for diffusions and demonstrate how such data-driven strategies with explicit sublinear regret bounds can be constructed by employing nonparametric statistical techniques.
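
As a rough illustration of the nonparametric ingredient mentioned above (not code from the talk), the following Python sketch estimates the drift of a one-dimensional diffusion from discretely observed increments by kernel regression, then runs the process under the plug-in estimate with a reflection boundary that is fixed here purely for illustration; the data-driven strategies in the talk choose the boundary from the estimated model.

    import numpy as np

    rng = np.random.default_rng(0)

    # Discrete observations of an uncontrolled diffusion dX = b(X) dt + dW;
    # the true drift b(x) = -x is treated as unknown.
    dt, n = 0.01, 50_000
    x = np.empty(n)
    x[0] = 0.0
    for i in range(1, n):
        x[i] = x[i-1] - x[i-1]*dt + np.sqrt(dt)*rng.standard_normal()

    def drift_at(u, bw=0.25):
        # Nadaraya-Watson estimate b(u) ~ E[X_{t+dt} - X_t | X_t = u] / dt
        w = np.exp(-0.5*((x[:-1] - u)/bw)**2)
        return np.sum(w*np.diff(x)) / (dt*np.sum(w))

    grid = np.linspace(-2.5, 2.5, 51)
    b_hat = np.array([drift_at(u) for u in grid])

    # Plug-in control: run the diffusion under the estimated drift and reflect
    # it at an upper boundary, recording the local time (control expended).
    boundary, T, h = 1.0, 20.0, 1e-3
    y, local_time = 0.0, 0.0
    for _ in range(int(T/h)):
        y += np.interp(y, grid, b_hat)*h + np.sqrt(h)*rng.standard_normal()
        if y > boundary:
            local_time += y - boundary   # singular push back onto the boundary
            y = boundary

    print("control expended (local time):", local_time)
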
22/11 | Anna Shalova (Eindhoven) | Singular-limit analysis of gradient descent with noise injection
Note: Hybrid
Abstract: We study the limiting dynamics of noisy gradient descent systems in the overparameterized regime. In this regime the set of global minimizers of the loss is large, and when initialized in a neighbourhood of this zero-loss set a noisy gradient descent algorithm slowly evolves along this set. In some cases this slow evolution has been related to better generalisation properties. We give an explicit characterization of this evolution for a broad class of noisy gradient descent systems. Our results show that the structure of the noise affects not just the form of the limiting process, but also the time scale at which the evolution takes place. We apply our theory to dropout, label noise and classical SGD (minibatching) noise. We show that dropout and label noise models evolve on two different time scales. At the same time, classical SGD yields a trivial evolution on both of these time scales, implying that additional noise is required for regularization. This is joint work with Mark Peletier and André Schlichting.
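
A standard toy example (not from the talk) of the phenomenon described above: for the overparameterised model f(w) = w1*w2 fitted to the target value 1, the zero-loss set is the hyperbola w1*w2 = 1; plain gradient descent initialised on it stays put, while injected label noise induces a slow drift along the set towards the balanced point (1, 1). All step counts and noise levels are illustrative.

    import numpy as np

    rng = np.random.default_rng(1)

    def run(noise_std, steps=200_000, lr=1e-3):
        w = np.array([2.0, 0.5])        # on the zero-loss set: w1*w2 = 1
        for _ in range(steps):
            target = 1.0 + noise_std*rng.standard_normal()  # label noise
            r = w[0]*w[1] - target                          # residual
            w -= lr*r*np.array([w[1], w[0]])                # grad of 0.5*r**2
        return w

    print("plain GD   :", run(noise_std=0.0))  # stays at (2.0, 0.5)
    print("label noise:", run(noise_std=0.5))  # drifts towards (1.0, 1.0)
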
15/11 | Saifuddin Syed (Oxford) | Scalable sampling using annealed algorithms
Note: Hybrid
Abstract: Generating samples from complex probability distributions is a fundamental challenge in statistical modelling and Bayesian statistics. In practice, this is generally impossible to do directly, and we must introduce a simpler reference distribution, such as a Gaussian, and manipulate its density and samples to approximate the target. In general, direct inference is reliable when the reference is close to the target and fragile when it is not. Annealing is a popular technique motivated by this principle: it introduces a sequence of distributions interpolating between the reference and the target, ensuring that neighbouring distributions are close. An annealing algorithm specifies how to traverse this bridge of distributions to incrementally transform samples from the reference into samples approximating the target. In this talk, we will construct two computationally dual annealing algorithms, Sequential Monte Carlo samplers (SMC) and Parallel Tempering (PT), which propagate samples from the reference to the target using importance sampling and Metropolis-Hastings, respectively. By analysing the variance of the normalising-constant estimator, we will see how performance scales with increasing runtime, parallelism, memory, and the difficulty of the inference problem. Notably, we will identify a critical phenomenon and explain why these algorithms are efficient and can scale to tackle modern sampling problems. Finally, we will provide a black-box algorithm to tune these samplers efficiently, along with practical guidelines for when to implement SMC versus PT.
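
A minimal SMC sampler along the lines sketched in the abstract, assuming a standard normal reference and a toy bimodal target (both illustrative choices): particles are reweighted between neighbouring annealed distributions, resampled, and moved by a random-walk Metropolis-Hastings step, while the weights accumulate an estimate of the log normalising-constant ratio (log 2 for this example).

    import numpy as np

    rng = np.random.default_rng(2)

    def log_ref(x):   # reference: N(0, 1), unnormalised
        return -0.5*x**2

    def log_tgt(x):   # unnormalised bimodal target
        return np.logaddexp(-0.5*(x - 3)**2, -0.5*(x + 3)**2)

    n_particles = 2_000
    betas = np.linspace(0.0, 1.0, 50)      # annealing schedule
    x = rng.standard_normal(n_particles)   # exact draws from the reference
    log_Z = 0.0                            # log normalising-constant ratio
    for b0, b1 in zip(betas[:-1], betas[1:]):
        # importance reweighting between neighbouring annealed distributions
        log_w = (b1 - b0)*(log_tgt(x) - log_ref(x))
        log_Z += np.logaddexp.reduce(log_w) - np.log(n_particles)
        w = np.exp(log_w - log_w.max())
        x = x[rng.choice(n_particles, n_particles, p=w/w.sum())]  # resample
        # random-walk Metropolis-Hastings move targeting pi_{b1}
        def log_pi(u):
            return (1 - b1)*log_ref(u) + b1*log_tgt(u)
        prop = x + rng.standard_normal(n_particles)
        accept = np.log(rng.random(n_particles)) < log_pi(prop) - log_pi(x)
        x = np.where(accept, prop, x)

    print("log Z estimate:", log_Z)   # true value: log 2 ~ 0.693
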
8/11 | Antonin Schrab (UCL) | Optimal Kernel Hypothesis Testing
Note: Hybrid
Abstract: The thesis proposes new kernel hypothesis tests and proves optimal power guarantees for them. Various testing frameworks are considered, including the two-sample, independence and goodness-of-fit frameworks. A strong focus is placed on the often-ignored but crucial choice of kernel, which strongly impacts test power. Two methods, kernel pooling and aggregation, are proposed to adaptively select kernels in a parameter-free manner, and are shown to lead to minimax optimal separation rates with respect to the kernel and L2 metrics. Optimal kernel tests are also developed, and their power guarantees theoretically analysed, under various testing constraints such as computational efficiency, differential privacy, and robustness to data corruption.
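
A simplified illustration of kernel aggregation for the two-sample problem (a plain Bonferroni correction over a bandwidth collection, rather than the weighted adjustment developed in the thesis): each Gaussian-kernel MMD statistic is calibrated by permutations at level alpha divided by the number of kernels, and the test rejects if any kernel does.

    import numpy as np

    rng = np.random.default_rng(3)

    def mmd2(K, n):
        # biased squared MMD from the joint kernel matrix on z = [x; y]
        return K[:n, :n].mean() + K[n:, n:].mean() - 2*K[:n, n:].mean()

    def agg_mmd_test(x, y, bandwidths, n_perm=200, alpha=0.05):
        n = len(x)
        z = np.concatenate([x, y])
        d2 = (z[:, None] - z[None, :])**2
        for bw in bandwidths:              # collection of Gaussian kernels
            K = np.exp(-d2/(2*bw**2))
            null = [mmd2(K[np.ix_(p, p)], n)
                    for p in (rng.permutation(2*n) for _ in range(n_perm))]
            # Bonferroni-style correction across the kernel collection
            if mmd2(K, n) > np.quantile(null, 1 - alpha/len(bandwidths)):
                return True
        return False

    x = rng.standard_normal(100)
    y = 0.5 + rng.standard_normal(100)     # mean-shifted alternative
    print("reject H0:", agg_mmd_test(x, y, bandwidths=[0.5, 1.0, 2.0]))
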
1/11 | Alice Corbella (Warwick) | The Lifebelt Particle Filter: a novel robust SMC scheme
Note: Hybrid
Abstract: Sequential Monte Carlo (SMC) methods can be applied to discrete state-space models on bounded domains, to sample from and marginalise over unknown random variables. As in continuous settings, problems such as particle degeneracy can arise: proposed particles can be incompatible with the data, lying in low-probability regions or outside the boundary constraints, and the discrete system can result in all particles having weight zero. In this talk I will introduce the Lifebelt Particle Filter (LBPF), a novel SMC method for robust likelihood estimation in low-valued count problems. The LBPF combines a standard particle filter with one (or more) lifebelt particles which, by construction, lie within the boundaries of the discrete random variables and are therefore compatible with the data. The main benefit of the LBPF is that only one or a few wisely chosen particles are sufficient to prevent particle collapse. The LBPF can be used within a pseudo-marginal scheme to draw inference on static parameters, θ, governing the system. In the talk I will also present an example of the use of the LBPF for the estimation of the parameters governing the death and recovery process of hospitalised patients during an epidemic. Ref: Corbella, A., McKinley, T.J., Birrell, P.J., Presanis, A.M., Roberts, G.O., De Angelis, D. and Spencer, S.E. (2022). The Lifebelt Particle Filter for robust estimation from low-valued count data. arXiv preprint arXiv:2212.04400.
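
A caricature of the lifebelt idea (not the published algorithm, which constructs and weights the lifebelt more carefully), on a toy death process where a hidden count produces an exactly observed binomial number of deaths each day: ordinary particles whose count falls below the observed deaths get weight exactly zero, while one deliberately large particle always remains compatible with the data and keeps the likelihood estimate alive.

    import numpy as np
    from scipy.stats import binom

    rng = np.random.default_rng(4)

    # Toy data: x_t patients, each dies with probability q per day, and the
    # exact daily death count d_t is observed, so x_{t+1} = x_t - d_t.
    q, alive, deaths = 0.2, 25, []
    for _ in range(10):
        d = int(rng.binomial(alive, q))
        deaths.append(d)
        alive -= d

    n_part = 100
    x = rng.integers(5, 60, n_part)   # particles for the unknown initial count
    x[0] = 500                        # lifebelt: large enough to be compatible
                                      # with any observed death count
    log_lik = 0.0
    for d in deaths:
        w = binom.pmf(d, x, q)        # exactly zero whenever x < d
        log_lik += np.log(w.mean())   # marginal-likelihood contribution
        idx = rng.choice(n_part, n_part, p=w/w.sum())
        idx[0] = 0                    # the lifebelt particle is always kept
        x = x[idx] - d                # transition is deterministic given d

    print("log-likelihood estimate:", log_lik)
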
25/10 | Heishiro Kanagawa (Newcastle) | Reinforcement Learning for Adaptive MCMC
Note: Hybrid
Abstract: An informal observation, made by several authors, is that the adaptive design of a Markov transition kernel has the flavour of a reinforcement learning task. Yet, to date it has remained unclear how to actually exploit modern reinforcement learning technologies for adaptive MCMC. The aim of this work is to set out a general framework, called Reinforcement Learning Metropolis-Hastings, that is theoretically supported and empirically validated. Our principal focus is on learning fast-mixing Metropolis-Hastings transition kernels, which we cast as deterministic policies and optimise via a policy gradient. Control of the learning rate provably ensures that conditions for ergodicity are satisfied. The methodology is used to construct a gradient-free sampler that outperforms a popular gradient-free adaptive Metropolis-Hastings algorithm on ≈90% of tasks in the PosteriorDB benchmark.
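
A bare-bones illustration of the policy-gradient idea (not the framework from the talk): the log proposal scale of a random-walk Metropolis-Hastings sampler is treated as a policy parameter and updated by a REINFORCE-style rule that rewards squared jump distance, with a decaying learning rate in the spirit of the diminishing-adaptation condition mentioned in the abstract. Target, reward and step-size schedule are all illustrative choices.

    import numpy as np

    rng = np.random.default_rng(5)

    def log_target(x):   # standard normal target
        return -0.5*x**2

    theta = np.log(0.1)  # log proposal scale: the "policy" parameter
    x, accepts, n = 0.0, 0, 100_000
    for i in range(n):
        eps = rng.standard_normal()
        prop = x + np.exp(theta)*eps             # random-walk proposal
        if np.log(rng.random()) < log_target(prop) - log_target(x):
            jump2, x, accepts = (prop - x)**2, prop, accepts + 1
        else:
            jump2 = 0.0
        # REINFORCE update: reward = squared jump distance; the score of the
        # proposal N(x, e^(2*theta)) with respect to theta is (eps**2 - 1).
        theta += 0.01/(1 + i)**0.6 * jump2 * (eps**2 - 1)

    print("learnt proposal scale:", np.exp(theta))  # classical optimum ~ 2.4
    print("acceptance rate:", accepts/n)            # classical optimum ~ 0.44
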
18/10 | Anastasia Mantziou (Warwick) | Bayesian model-based clustering for populations of network data
Note: Hybrid
Abstract: There is increasing appetite for analysing populations of network data due to the fast-growing body of applications demanding such methods. While methods exist to provide readily interpretable summaries of heterogeneous network populations, these are often descriptive or ad hoc, lacking any formal justification. In contrast, principled analysis methods often provide results difficult to relate back to the applied problem of interest. Motivated by two complementary applied examples, we develop a Bayesian framework to appropriately model complex heterogeneous network populations, while also allowing analysts to gain insights from the data and make inferences most relevant to their needs. The first application involves a study in computer science measuring human movements across a university. The second analyses data from neuroscience investigating relationships between different regions of the brain. While both applications entail analysis of a heterogeneous population of networks, network sizes vary considerably. We focus on the problem of clustering the elements of a network population, where each cluster is characterised by a network representative. We take advantage of the Bayesian machinery to simultaneously infer the cluster membership, the representatives, and the community structure of the representatives, thus allowing intuitive inferences to be made. The implementation of our method on the human movement study reveals interesting movement patterns of individuals in clusters, readily characterised by their network representative. For the brain networks application, our model reveals a cluster of individuals with different network properties of particular interest in neuroscience. The performance of our method is additionally validated in extensive simulation studies.
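
A toy, non-Bayesian caricature of the clustering task above (the model in the talk additionally infers community structure on the representatives within a full Bayesian framework): networks are reduced to binary edge vectors, representatives are updated by edge-wise majority vote, and memberships by Hamming distance. All sizes and noise levels are illustrative.

    import numpy as np

    rng = np.random.default_rng(6)

    # Toy population: 40 networks on 10 nodes, each a noisy copy (10% of
    # edges flipped) of one of two group representatives.
    n_nets, flip = 40, 0.1
    m = 10*(10 - 1)//2                       # number of possible edges
    reps_true = rng.random((2, m)) < 0.4
    truth = np.arange(n_nets) % 2
    nets = np.array([np.where(rng.random(m) < flip,
                              ~reps_true[g], reps_true[g])
                     for g in truth])

    # Alternate between updating representatives (edge-wise majority vote)
    # and reassigning networks to their nearest representative (Hamming).
    z = rng.integers(0, 2, n_nets)
    for _ in range(20):
        reps = np.array([nets[z == k].mean(axis=0) > 0.5 for k in range(2)])
        dists = np.array([[np.sum(net != rep) for rep in reps]
                          for net in nets])
        z = dists.argmin(axis=1)

    acc = max(np.mean(z == truth), np.mean(z != truth))  # labels exchangeable
    print("agreement with the true grouping:", acc)
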
11/10 | Luke Hardcastle (UCL) | Piecewise Deterministic Markov Processes for transdimensional sampling from flexible Bayesian survival models
Note: Hybrid (slides available)
Abstract: Flexible survival models have seen increasing popularity for the estimation of mean survival in the presence of a high degree of administrative censoring, where survival curves need to be extrapolated beyond the final observed event times. This increased flexibility, however, often introduces challenging model selection problems that have limited their wider application. In this talk I will focus on two such models, the polyhazard model and the piecewise exponential model. We introduce new prior structures that allow for the joint inference of parameters and structural quantities. Posterior sampling is achieved using bespoke MCMC schemes based on Piecewise Deterministic Markov Processes that utilise and extend existing methods for these samplers to target transdimensional posterior distributions. This is joint work with Samuel Livingstone and Gianluca Baio.
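
For readers unfamiliar with PDMP samplers, here is the basic building block only (not the bespoke transdimensional schemes of the talk): a one-dimensional Zig-Zag sampler for a standard normal target, with event times simulated exactly by inverting the integrated rate.

    import numpy as np

    rng = np.random.default_rng(7)

    # One-dimensional Zig-Zag sampler targeting N(0, 1): the state moves at
    # velocity v in {-1, +1}, and v flips at events of an inhomogeneous
    # Poisson process with rate max(v*x, 0), since U(x) = x^2/2, U'(x) = x.
    x, v, t, T = 0.0, 1.0, 0.0, 10_000.0
    times, positions = [0.0], [0.0]
    while t < T:
        a = v*x
        e = rng.exponential()
        tau = -a + np.sqrt(max(a, 0.0)**2 + 2*e)  # exact event time, inversion
        x += v*tau                                # deterministic linear flight
        v, t = -v, t + tau                        # flip velocity at the event
        times.append(t)
        positions.append(x)

    # Trajectory averages are time integrals over the linear segments; here
    # we estimate E[X^2], which equals 1 under the target.
    ts, pos = np.array(times), np.array(positions)
    x0, x1, seg = pos[:-1], pos[1:], np.diff(ts)
    est = np.sum((x1**3 - x0**3)/(3*(x1 - x0))*seg)/ts[-1]
    print("E[X^2] estimate:", est)
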