Programme CRiSM 2.0 - Conference 2025
All talks will be held in rooms in the Zeeman Building (Mathematics) and the MSB Building (Statistics). Registration and the coffee breaks on Wednesday will be held in "The Street", in front of Zeeman room MS.01. All lunches, the poster session, and the wine and food reception will be held in "The Atrium", the main foyer of the MSB Building. The schedule of the main conference is given below.
**Update post-conference: slides which have been kindly shared can be found linked from the talk titles below.**
Schedule
Day | Time | Activity / Speakers | Location (all sessions in Zeeman) |
---|---|---|---|
Wednesday 21 May | 14:30 - 15:00 | Registration and Coffee | The Street (Zeeman) |
 | 15:00 - 16:00 | Keynote Lecture: Regina Liu, "Fusion Learning: Combining inferences from diverse data sources" | MS.01 |
 | 16:00 - 16:45 | Parallel Session A (Statistical theory & application); Parallel Session B (Probability) | MS.01 |
 | 16:45 - 17:30 | Parallel Session C (Computational statistics & machine learning); Parallel Session D (Mathematical finance) | MS.01 |
 | 18:00 - 21:00 | Wine Reception and Posters | The Atrium (MSB) |
Thursday 22 May | 09:00 - 10:00 | Keynote Lecture: Jan Obłój, "X-OT: on variants of optimal transport problem and understanding model robustness" | MS.01 |
 | 10:00 - 10:45 | Parallel Session D (Mathematical finance); Parallel Session A (Statistical theory & application) | MS.01 |
 | 10:45 - 11:15 | Coffee Break | The Street (Zeeman) |
 | 11:15 - 12:00 | Parallel Session B (Probability); Parallel Session C (Computational statistics & machine learning) | MS.01, MS.04 |
 | 12:00 - 12:45 | Parallel Session D (Mathematical finance); Parallel Session A (Statistical theory & application) | MS.01 |
 | 12:45 - 14:15 | Lunch | The Atrium (MSB) |
 | 14:15 - 15:15 | Keynote Lecture: George Deligiannidis, "Theory for denoising diffusion models and the manifold hypothesis" | MS.01 |
 | 15:15 - 16:00 | Parallel Session C (Computational statistics & machine learning); Parallel Session B (Probability) | MS.01, MS.04 |
 | 16:00 - 16:20 | Coffee Break | The Street (Zeeman) |
 | 16:20 - 17:05 | Parallel Session A (Statistical theory & application); Parallel Session D (Mathematical finance) | MS.01, MS.04 |
 | 19:00 | Dinner (by invitation only) | Scarman |
Friday 23 May | 09:00 - 10:00 | Keynote Lecture: Philip Ernst, "Yule's 'nonsense correlation': Moments and density" | MS.01 |
 | 10:00 - 10:45 | Parallel Session B (Probability); Parallel Session C (Computational statistics & machine learning) | MS.01, MS.04 |
 | 10:45 - 11:15 | Coffee Break | The Street (Zeeman) |
 | 11:15 - 12:00 | Parallel Session D (Mathematical finance); Parallel Session C (Computational statistics & machine learning) | MS.01, MS.04 |
 | 12:00 - 12:45 | Parallel Session B (Probability); Parallel Session A (Statistical theory & application) | MS.01, MS.04 |
 | 12:45 - 13:00 | Closing Remarks, followed by a light lunch in "The Street" | MS.01 |
Speakers, titles, and abstracts
Regina Liu (Keynote Speaker) (Rutgers University) - Fusion Learning: combining inferences from diverse data sources
Advances in data acquisition technology have made inferences from diverse data sources readily accessible. Fusion learning refers to combining inferences from multiple sources or studies to make a more effective overall inference. We focus on three tasks: 1) whether and when to combine inferences; 2) how to combine inferences efficiently; 3) how to combine inferences to enhance an individual study (hence the name i-Fusion).
We present a general framework for nonparametric and efficient fusion learning for inference on multi-parameters, which may be correlated. The main tool underlying this framework is the new notion of depth confidence distribution (depth-CD), which is developed by combining data depth, bootstrap and confidence distributions. We show that a depth-CD is an omnibus form of confidence regions, whose contours of level sets shrink toward the true parameter value, and thus an all-encompassing inferential tool. The approach is shown to be efficient, general and robust. It readily applies to heterogeneous studies with a broad range of complex and irregular settings. This property also enables the approach to utilize indirect evidence from incomplete studies to gain efficiency for the overall inference. The approach is demonstrated with simulation studies and real applications in tracking aircraft landing performance and in zero-event studies in clinical trials.
This is joint work with Dungang Liu (U. Cincinnati, Lindner College of Business) and Minge Xie (Rutgers University).
John Aston (University of Cambridge) - Using geometry in non-parametric statistics
In non-parametric regression, the rates of estimation depend critically on the dimension, a phenomenon usually known as the curse of dimensionality. It has long been known that incorporating structure into the regression (such as sparsity) can improve on general rates. However, sparsity and most related concepts are linear in the data, while many patterns in regression are non-linear in nature and crucially depend on all the covariates. In this talk, we will consider how more general structure can be incorporated into non-parametric regression through the use of symmetries. General notions of symmetry corresponding to algebraic group structures can exhibit similar dimension-reduction phenomena, even in non-linear settings, and even when the symmetries need to be estimated as part of the regression. We will also show that, by considering lattice structures, efficient computational schemes for estimating such symmetries are possible. This is joint work with Louis Christie (Cambridge).
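As a purely illustrative aside (not the estimator studied in the talk), the basic benefit of a known symmetry can be sketched by averaging a standard kernel regression estimator over a finite group acting on the covariates; the group, bandwidth, and toy data below are all assumptions made for the example.

```python
import numpy as np

def nw_estimate(x0, X, y, h):
    """Plain Nadaraya-Watson estimate at x0 with a Gaussian kernel."""
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * h ** 2))
    return np.sum(w * y) / np.sum(w)

def symmetrized_nw(x0, X, y, h, group):
    """Average the kernel estimator over a finite group of covariate symmetries."""
    return np.mean([nw_estimate(g(x0), X, y, h) for g in group])

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(np.abs(X[:, 0])) + 0.1 * rng.standard_normal(200)   # invariant under x1 -> -x1

group = [lambda x: x, lambda x: x * np.array([-1.0, 1.0])]      # identity and reflection
print(symmetrized_nw(np.array([0.3, 0.0]), X, y, h=0.2, group=group))
```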
Ioannis Kosmidis (University of Warwick) - Penalized likelihood estimation and inference in high-dimensional logistic regression
In recent years, there has been a surge of interest in estimators and inferential procedures that exhibit optimal asymptotic properties in high-dimensional logistic regression when the number of covariates grows proportionally as a fraction ($\kappa \in (0,1)$) of the number of observations. In this seminar, we focus on the behaviour of a class of maximum penalized likelihood estimators, employing the Diaconis-Ylvisaker prior as the penalty.
Building on advancements in approximate message passing, we analyze the aggregate asymptotic behaviour of these estimators when covariates are normal random variables with arbitrary covariance. This analysis enables us to eliminate the persistent asymptotic bias of the estimators through straightforward rescaling for any value of the prior hypertuning parameter. Moreover, we derive asymptotic pivots for constructing inferences, including adjusted Z-statistics and penalized likelihood ratio statistics.
Unlike the maximum likelihood estimate, which asymptotically exists only in a limited region of the plane of $\kappa$ versus signal strength, the maximum penalized likelihood estimate always exists and is directly computable via maximum likelihood routines. As a result, our asymptotic results remain valid even in regions where existing maximum likelihood results are not obtainable, with no overhead in implementation or computation.
The dependency of the estimators on the prior hyper-parameter facilitates the derivation of estimators with zero asymptotic bias and minimal mean squared error. We will explore these estimators' shrinkage properties, substantiate our theoretical findings with simulations and applications, and present evidence for conjectures with different penalties, such as Jeffreys' prior and non-normal covariate distributions.
Collaborators: Philipp Sterzinger and Patrick Zietkiewicz
Preprints: https://arxiv.org/abs/2311.07419, https://arxiv.org/abs/2311.11290
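For orientation only, the sketch below shows what maximum penalized likelihood estimation in logistic regression looks like computationally; it uses a generic quadratic penalty as a stand-in, not the Diaconis-Ylvisaker penalty analysed in the talk, and the data, dimensions, and penalty weight are made up.

```python
import numpy as np
from scipy.optimize import minimize

def penalized_logistic_fit(X, y, lam):
    """Maximize a penalized logistic log-likelihood.

    A quadratic penalty is used here purely as a simple stand-in; the
    Diaconis-Ylvisaker penalty from the talk has a different form."""
    def neg_pen_loglik(beta):
        eta = X @ beta
        loglik = np.sum(y * eta - np.logaddexp(0.0, eta))   # sum_i y_i eta_i - log(1 + e^{eta_i})
        return -(loglik - lam * np.sum(beta ** 2))
    beta0 = np.zeros(X.shape[1])
    return minimize(neg_pen_loglik, beta0, method="BFGS").x

rng = np.random.default_rng(1)
n, p = 500, 100                                              # p grows as a fraction kappa = p/n of n
X = rng.standard_normal((n, p)) / np.sqrt(n)
beta_true = 3.0 * rng.standard_normal(p)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))
beta_hat = penalized_logistic_fit(X, y, lam=1.0)
print(beta_hat[:5])
```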
Helen Ogden (University of Southampton) - Flexible models for simple longitudinal data
I will describe a new method for modelling simple longitudinal data. We aim to do this in a flexible manner (without restrictive assumptions about the shapes of individual trajectories), while exploiting structural similarities between the trajectories. Hierarchical models (such as linear mixed models, generalised additive mixed models and hierarchical generalised additive models) are commonly used to model longitudinal data, but fail to meet one or the other of these requirements: either they make restrictive assumptions about the shape of individual trajectories, or they fail to exploit structural similarities between trajectories. Functional principal components analysis promises to fulfil both requirements, and methods for functional principal components analysis have been developed for longitudinal data. However, we find that existing methods sometimes give poor-quality estimates, particularly when the number of observations on each individual is small.
In our new approach, hierarchical modelling with functional principal components, inference is conducted based on the full likelihood of all unknowns, with a penalty term to control the balance between fit to the data and smoothness of the trajectories. I will present simulation studies to demonstrate that the new method substantially improves the quality of inference relative to existing methods across a range of examples, and apply the method to data on changes in body composition in adolescent girls.
Judith Rousseau (University of Oxford/Université Paris Dauphine) - Bayesian estimation in high dimensional Hawkes processes
Multivariate Hawkes processes form a class of point processes describing self- and inter-exciting/inhibiting behaviour. There is now renewed interest in such processes in applied domains and in machine learning, but only limited theory exists about inference in such models, in particular in high dimensions.
To be more precise, the intensity function of a linear Hawkes process has the following form: for each dimension $k \leq K$
\[ \lambda^k(t) = \sum_{\ell \leq K} \int_0^{t^-} h_{\ell k}(t - s) dN_s^\ell + \nu_k, \quad t \in [0, T] \]
where $(N^\ell, \ell \leq K)$ is the Hawkes process and $\nu_k > 0$. There have been some recent theoretical results on Bayesian estimation in the context of linear and nonlinear multivariate Hawkes processes, but these results assumed that the dimension $K$ was fixed. Convergence rates were studied assuming that the observation window $T$ goes to infinity.
In this work, we consider the case where $K$ is allowed to go to infinity with $T$. We consider generic conditions to obtain posterior convergence rates, and we derive, under sparsity assumptions, convergence rates in the $L_1$ norm and consistent estimation of the graph of interactions.
This is joint work with Vincent Rivoirard and Deborah Sulem.
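To make the intensity formula above concrete, here is a small sketch that evaluates $\lambda^k(t)$ for a toy two-dimensional process; the exponential interaction kernels $h_{\ell k}$, parameters, and event times are assumptions chosen for illustration.

```python
import numpy as np

def hawkes_intensity(t, events, nu, alpha, beta):
    """Linear multivariate Hawkes intensity at time t:
    lambda^k(t) = nu_k + sum_l sum_{s in events[l], s < t} h_{lk}(t - s),
    with exponential kernels h_{lk}(u) = alpha[l, k] * exp(-beta * u) (an assumption)."""
    K = len(nu)
    lam = np.array(nu, dtype=float)
    for l in range(K):
        past = np.asarray(events[l])
        past = past[past < t]
        for k in range(K):
            lam[k] += np.sum(alpha[l, k] * np.exp(-beta * (t - past)))
    return lam

# Toy 2-dimensional example with made-up event times and parameters.
events = [np.array([0.2, 0.9, 1.5]), np.array([0.5, 1.1])]
nu = [0.4, 0.3]
alpha = np.array([[0.5, 0.2], [0.0, 0.6]])   # alpha[l, k]: influence of dimension l on k
print(hawkes_intensity(2.0, events, nu, alpha, beta=1.5))
```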
Amy Wilson (University of Edinburgh) - Statistics and the law: what’s the verdict?
Criminal cases can feature multiple pieces of dependent evidence and multiple possible explanations for this evidence. It can be challenging to disentangle the correlations between pieces of evidence and to understand how to form a logically consistent argument that accounts for this evidence in a way that is probabilistically sound. There have been high-profile miscarriages of justice that have resulted from failures in probabilistic reasoning and interpretation.
In this talk I will show how chain event graphs can be used to construct possible storylines for displaying the time evolution of events and evidence in criminal cases. These chain event graphs can be used both to investigate possible arguments when drawing up a case and to make probabilistic assessments of the strength of evidence when prosecuting or defending. I will give two examples: a drugs-on-banknotes case and the case of the murder of Meredith Kercher. To finish the talk I will discuss the role that statistics and probabilistic reasoning can play in criminal cases and highlight where greater collaboration is needed between statisticians and those in the legal sector.
Poster contributions
Matthew Adeoye (University of Warwick) - Bayesian spatio-temporal modelling for infectious disease outbreak detection
The Bayesian analysis of infectious disease surveillance data from multiple locations typically involves building and fitting a spatio-temporal model of how the disease spreads in the structured population. Here we present new generally applicable methodology to perform this task. We introduce a parsimonious representation of seasonality and a biologically informed specification of the outbreak component to avoid parameter identifiability issues. We develop a computationally efficient Bayesian inference methodology for the proposed models, including techniques to detect outbreaks by computing marginal posterior probabilities at each spatial location and time point. We show that it is possible to efficiently integrate out the discrete parameters associated with outbreak states, enabling the use of dynamic Hamiltonian Monte Carlo (HMC) as a complementary alternative to a hybrid Markov chain Monte Carlo (MCMC) algorithm. Furthermore, we introduce a robust Bayesian model comparison framework based on importance sampling to approximate model evidence in high-dimensional space. The performance of our methodology is validated through systematic simulation studies, where simulated outbreaks were successfully detected, and our model comparison strategy demonstrates strong reliability. We also apply our new methodology to monthly incidence data on invasive meningococcal disease from 28 European countries. The results highlight outbreaks across multiple countries and months, with model comparison analysis showing that the new specification outperforms previous approaches.
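As an illustrative aside, the general idea of integrating out discrete outbreak indicators and reporting marginal posterior outbreak probabilities per time point can be sketched with a generic two-state hidden Markov model and the forward-backward recursions; the Poisson emission model, transition matrix, and counts below are made-up stand-ins, not the model in the poster.

```python
import numpy as np
from scipy.stats import poisson

def hmm_marginals(log_emission, trans, init):
    """Forward-backward marginals P(state_t = k | data) for a discrete-state HMM;
    a generic stand-in for integrating out outbreak indicators."""
    T, K = log_emission.shape
    emis = np.exp(log_emission - log_emission.max(axis=1, keepdims=True))  # rescale per time step
    alpha = np.zeros((T, K)); beta = np.ones((T, K))
    alpha[0] = init * emis[0]; alpha[0] /= alpha[0].sum()
    for t in range(1, T):                       # scaled forward recursion
        alpha[t] = emis[t] * (alpha[t - 1] @ trans)
        alpha[t] /= alpha[t].sum()
    for t in range(T - 2, -1, -1):              # scaled backward recursion
        beta[t] = trans @ (emis[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Toy monthly counts; state 1 ("outbreak") has a higher Poisson mean.
counts = np.array([3, 2, 4, 9, 11, 10, 3, 2])
log_em = np.stack([poisson.logpmf(counts, 3.0), poisson.logpmf(counts, 8.0)], axis=1)
trans = np.array([[0.9, 0.1], [0.2, 0.8]])
print(hmm_marginals(log_em, trans, init=np.array([0.95, 0.05]))[:, 1])  # outbreak probability per month
```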
Nicola Branchini (University of Edinburgh) - Revisiting SNIS: new methods and diagnostics
Importance sampling (IS) can often be implemented only with normalized weights, yielding the popular self-normalized IS (SNIS) estimator. Yet, proposal distributions are usually learned and evaluated using criteria designed for the unnormalized IS (UIS) estimator. We aim to present a unified perspective on our recent advances in understanding and improving SNIS. Specifically, we connect four contributions. First, we compare two frameworks for adaptive importance sampling (AIS) tailored to SNIS. The former exploits the view of SNIS as a ratio of two UIS estimators, coupling two separate AIS samplers in an extended-space joint distribution. The latter instead proposes the first MCMC-driven AIS sampler directly targeting the (often overlooked) optimal SNIS proposal. Second, we establish a close connection between the optimal SNIS proposal and so-called subtractive mixture models (SMMs), where negative coefficients are possible - motivating the study of the properties of the first IS estimators using SMMs. Finally, we propose new Monte Carlo diagnostics specifically for SNIS. They extend existing diagnostics for the numerator and denominator by incorporating their statistical dependence, drawing on different notions of tail dependence from multivariate extreme value theory.
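For readers less familiar with SNIS, the following minimal sketch computes the estimator and the standard effective-sample-size diagnostic on a toy Gaussian example; the target, proposal, and test function are assumptions, and the dependence-aware diagnostics described in the poster are not reproduced here.

```python
import numpy as np

def snis(f, log_w):
    """Self-normalized importance sampling estimate of E_p[f], given samples from a
    proposal q and unnormalized log-weights log p_tilde(x) - log q(x)."""
    w = np.exp(log_w - log_w.max())               # stabilise before normalising
    w /= w.sum()
    return np.sum(w * f), 1.0 / np.sum(w ** 2)    # estimate and effective sample size

rng = np.random.default_rng(2)
x = 2.0 * rng.standard_normal(5000)                            # proposal q = N(0, 4)
log_p = -0.5 * (x - 1.0) ** 2                                  # target p = N(1, 1), unnormalized
log_q = -0.5 * x ** 2 / 4.0 - np.log(2.0)                      # log q up to constants (they cancel)
est, ess = snis(f=x ** 2, log_w=log_p - log_q)
print(est, ess)                                                # E_p[X^2] = 2 for p = N(1, 1)
```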
Bowen Fang (University of Warwick) - Splitting schemes and parameter inference in univariate stochastic differential equations with Hölder diffusion coefficients
Many real-world biological phenomena, such as population dynamics, neuronal activity, and ecological systems, are modeled using stochastic differential equations (SDEs) with multiplicative noise. Important one-dimensional examples include the Jacobi (Wright-Fisher) processes for genetic drift and neuronal models, as well as the broader Pearson diffusion class, the stochastic Ginzburg-Landau equation, and the stochastic Verhulst equation. However, exact simulation schemes for these models are often unavailable or computationally prohibitive. In this work, we propose a novel numerical scheme for univariate SDEs with locally Lipschitz drift and Hölder continuous diffusion coefficients, and prove its mean-square convergence. Differently from other existing results based on the Lamperti transform, our approach belongs to the class of splitting schemes. Specifically, we decompose the original equation into explicitly solvable subequations, and then compose their solutions using two composition schemes. The derived scheme yields notable properties on both the numerical and the inferential side. With respect to the former, it outperforms traditional stochastic Taylor expansion methods, such as Euler-Maruyama, in both order of convergence and property preservation. For example, it ensures boundary preservation for SDEs with constrained state spaces and improves empirical distribution convergence to invariant measures, allowing for more accurate and robust simulations. Beyond simulation, these splitting schemes admit tractable transition densities, enabling parameter inference via pseudo-maximum likelihood estimation and Bayesian approaches, providing a practical framework for learning parameters of interest in complex biological systems.
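To illustrate the general splitting idea (not the specific scheme proved in the poster), the sketch below applies a Lie-Trotter composition to the stochastic Verhulst equation $dX = rX(1-X)\,dt + \sigma X\,dW$, combining the exact flows of the logistic ODE and of the pure-noise subequation; the parameters and composition order are assumptions.

```python
import numpy as np

def verhulst_splitting(x0, r, sigma, T, n, rng):
    """Lie-Trotter splitting for dX = r X (1 - X) dt + sigma X dW: on each step,
    compose the exact flow of the logistic ODE with the exact flow of dX = sigma X dW.
    Illustrative only; not the scheme analysed in the poster."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        # exact flow of the deterministic logistic part over dt
        xd = x[i] / (x[i] + (1.0 - x[i]) * np.exp(-r * dt))
        # exact flow of the noise part dX = sigma X dW over dt (driftless GBM)
        dw = np.sqrt(dt) * rng.standard_normal()
        x[i + 1] = xd * np.exp(sigma * dw - 0.5 * sigma ** 2 * dt)
    return x

rng = np.random.default_rng(3)
path = verhulst_splitting(x0=0.1, r=2.0, sigma=0.5, T=5.0, n=500, rng=rng)
print(path[-1])
```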
Shu Huang (University of Warwick) - Inference for Diffusion Processes via Controlled Sequential Monte Carlo and splitting schemes
We introduce an inferential framework for a wide class of semi-linear stochastic differential equations (SDEs) with additive noise, with drift satisfying a global one-sided Lipschitz condition with at most polynomial growth. Recently, explicit mean-square convergent numerical splitting schemes have been shown to 1) preserve important structural properties of the SDE (e.g. hypoellipticity in every iteration step, geometric ergodicity, oscillatory dynamics); and 2) give rise to explicit pseudo-likelihoods for fully observed processes. Here, under different observation regimes and SDEs (e.g. partially observed, hypoelliptic, fully observed with additive noise), we formalise the computation of the implied pseudo-likelihood as the normalizing constant of a Feynman-Kac flow, allowing its efficient estimation by the controlled Sequential Monte Carlo algorithm. The estimated pseudo-likelihood is then used within a pseudo-likelihood-driven Markov chain Monte Carlo method to target the posterior distribution in the Bayesian framework, and numerically optimised to obtain point estimates that enjoy good asymptotic properties at relatively low computational cost in the frequentist setting. We will illustrate our method on the partially observed hypoelliptic FitzHugh-Nagumo model, whose first component, the only one observed, describes the membrane voltage evolution of a single neuron. Our simulation results show that the proposed approach achieves low variance with a bias that can be corrected via bridge sampling.
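As a rough illustration of estimating the normalizing constant of a Feynman-Kac flow, the sketch below runs a plain bootstrap particle filter on a toy state-space model; this is a simpler stand-in for controlled SMC, and the model, parameters, and observation density are assumptions.

```python
import numpy as np

def bootstrap_pf_loglik(y, propagate, loglik_obs, n_particles, rng):
    """Log normalizing-constant estimate of a Feynman-Kac flow via a bootstrap
    particle filter (a simpler stand-in for controlled SMC)."""
    x = np.zeros(n_particles)
    total = 0.0
    for obs in y:
        x = propagate(x, rng)                                # move particles through the dynamics
        logw = loglik_obs(obs, x)                            # weight by the observation density
        m = logw.max()
        total += m + np.log(np.mean(np.exp(logw - m)))       # running log-evidence estimate
        w = np.exp(logw - m); w /= w.sum()
        x = x[rng.choice(n_particles, n_particles, p=w)]     # multinomial resampling
    return total

# Toy linear-Gaussian state-space model with made-up parameters.
rng = np.random.default_rng(4)
propagate = lambda x, rng: 0.9 * x + rng.standard_normal(x.size)
loglik_obs = lambda obs, x: -0.5 * (obs - x) ** 2            # Gaussian observation density up to a constant
y = np.array([0.3, -0.1, 0.8, 1.2, 0.5])
print(bootstrap_pf_loglik(y, propagate, loglik_obs, n_particles=2000, rng=rng))
```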
David Huk (University of Warwick) - Your copula is a classifier in disguise: classification-based copula density estimation
We propose reinterpreting copula density estimation as a discriminative task. Under this novel estimation scheme, we train a classifier to distinguish samples from the joint density from those of the product of independent marginals, recovering the copula density in the process. We derive equivalences between well-known copula classes and classification problems naturally arising in our interpretation. Furthermore, we show our estimator achieves theoretical guarantees akin to maximum likelihood estimation. By identifying a connection with density ratio estimation, we benefit from the rich literature and models available for such problems. Empirically, we demonstrate the applicability of our approach by estimating copulas of real and high-dimensional datasets, outperforming competing copula estimators in density evaluation as well as sampling.
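A minimal sketch of the classification view, under toy assumptions (bivariate Gaussian data, a small neural-network classifier, balanced classes), is given below: pseudo-observations from the joint data are labelled 1, samples from the product of independent uniform marginals are labelled 0, and the fitted class-probability ratio approximates the copula density.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(5)
n = 4000
z = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=n)            # dependent toy data
u_joint = np.column_stack([rankdata(z[:, j]) / (n + 1) for j in range(2)])   # pseudo-observations
u_indep = rng.uniform(size=(n, 2))                                           # product of independent uniforms

# Train a classifier to separate joint pseudo-observations from independent ones.
X = np.vstack([u_joint, u_indep])
y = np.concatenate([np.ones(n), np.zeros(n)])
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0).fit(X, y)

u0 = np.array([[0.2, 0.25]])
p = clf.predict_proba(u0)[0, 1]
print(p / (1.0 - p))     # approximate copula density c(u0), since classes are balanced
```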
Dr Maria Sudell (University of Liverpool) - Joint or simultaneous modelling of related longitudinal and time-to-event data for a network of treatments across multiple data sources
Longitudinal data is recorded repeatedly over time, allowing trends over time to be examined as well as results at a particular timepoint (examples include monthly blood pressure measurements, repeated laboratory measurements or regular mental health assessments). Time-to-event or survival data records the time until an individual experiences a clinical event of interest or withdraws from the dataset for unrelated reasons (for example time until first stroke, or time until hospital discharge). Commonly in healthcare data, related longitudinal and time-to-event data exists (for example, blood pressure measured repeatedly over time might be related in some way or predictive of time to first stroke). Ignoring this relationship could lead to misleading or biased results in analyses, so joint models that simultaneously evaluate the longitudinal and time-to-event outcomes and their relationship are useful.
In healthcare, many treatments may exist for a particular condition, for a given group of patients (a population). To make properly informed decisions about how these treatments compare to each other, we need to be able to compare them simultaneously. This can be difficult, if different data sources only involve a subset of the possible treatments (for example, if many clinical trials have been conducted for a given condition, but they each examine subsets of the possible treatments).
Network Meta Analysis (NMA) provides an approach to pool data from multiple trials, each of which compares a subset of treatments from the complete set of possible treatments for a population (as long as a connected "network" of treatments can be drawn from the available data). Approaches for separate longitudinal NMA and separate time-to-event NMA currently exist, but not for NMA involving both longitudinal and time-to-event data. This research provides novel methodology and code linking joint modelling with NMA approaches, to better allow evaluation of all available treatment options and their effects on longitudinal and time-to-event outcomes of interest.
The proposed methodology and code are applied to a multi-study cardiovascular dataset, containing repeated blood pressure measurements and the times until various cardiovascular events (stroke, myocardial infarction, death), for a network of anti-hypertensive treatments.
Jixin Wang (Imperial College London) - Revisiting a Theorem of J.-P. Kahane on Random Covering in $\mathbb{R}^d$
Let $\mathcal{C}$ be a $d$-dimensional convex set of unit volume and assume that the numbers $1 > l_1 \ge l_2 \ge \ldots$ are given. Let $\mathbb{T}^d$ denote the $d$-dimensional torus of unit volume and equal sides (the $d$-dimensional unit cube with opposite faces identified) and let $\mathbf{z}_1, \mathbf{z}_2, \ldots$ be i.i.d. uniform points in $\mathbb{T}^d$. Further, let $\mathbf{z}_n \oplus (l_n \ast \mathcal{C})$ be a rescaled version of $\mathcal{C}$ with volume $l_n$ translated to $\mathbf{z}_n$. In this paper, we prove a necessary and sufficient condition for the union of homothetic copies $\mathbf{z}_n \oplus (l_n \ast \mathcal{C})$, $1 \le n < \infty$, of $\mathcal{C}$ to cover every point of $\mathbb{T}^d$ infinitely often with probability 1. This generalizes and extends a theorem of J.-P. Kahane (1990) on a.s. covering.
Mengxin Xi (King's College London) - Extrapolation and Smoothing of Tempered Posteriors
In Bayesian computational statistics, estimating expectations with respect to the posterior distribution is often of interest. Estimation with the majority of Markov chain Monte Carlo (MCMC) algorithms can be computationally challenging when dealing with an informative posterior distribution. In these cases, sampling methods may fail to yield high-quality samples under limited computational budgets. A well-known approach to these issues for exploring the full state space of complex distributions is tempering. In most approaches based on tempering, significant computational resources are devoted to sampling from the complex, untempered posterior $p_1$, and quantities of interest are approximated using the obtained samples accordingly. This work focuses on methods for approximating intractable Bayesian posterior quantities of interest using tempered distributions. In particular, our contribution reveals that approximating $p_1$ may be unnecessary, as posterior quantities of interest can, in principle, be extrapolated from their tempered equivalents with $t < 1$.
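For context, a tempered posterior is typically of the form $p_t(\theta) \propto \pi(\theta)\,L(\theta)^t$; the sketch below estimates a posterior expectation under several $t < 1$ by reweighting prior samples in a toy Gaussian model. The extrapolation of these tempered quantities to $t = 1$ is the contribution of the poster and is not implemented here; the model and all parameters are made up.

```python
import numpy as np

# Tempered posterior p_t(theta) proportional to prior(theta) * likelihood(theta)^t.
rng = np.random.default_rng(6)
data = rng.normal(2.0, 1.0, size=20)                 # toy observations
theta = rng.normal(0.0, 10.0, size=100_000)          # samples from the prior

# Log-likelihood of each prior sample (Gaussian model, unit variance, up to constants).
loglik = -0.5 * np.sum((data[None, :] - theta[:, None]) ** 2, axis=1)

for t in [0.1, 0.3, 0.5, 0.7]:
    logw = t * loglik                                # self-normalized weights for p_t
    w = np.exp(logw - logw.max()); w /= w.sum()
    print(t, np.sum(w * theta))                      # estimate of E_{p_t}[theta]
```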