Regular Seminars

Welcome to the CRiSM seminar series!

Seminars take place biweekly during term time between 2 and 3pm, and will be hosted on Microsoft Teams.

We encourage all postgraduate students (MSc and PhD) to attend this series: it is a great opportunity to learn more about current research within and outside the department.

CRiSM seminars 2020/21 are organised by Tom Berrett.

The following talks are scheduled for the academic year 2020/2021:

Term 1

  • Wednesday October 28 - François Caron - University of Oxford

  • Thursday November 12 - Stefan Wager - Stanford University

  • Thursday November 26 (10am) - Wai Kin Wong - Hong Kong Observatory

  • Thursday December 10 - Sofia Olhede - EPFL Lausanne

Term 2 (on Wednesdays)

Term 3

Titles and abstracts will appear below as soon as they are available.


October 28: François Caron (University of Oxford)

Title: Non-exchangeable random partition models for microclustering

Abstract: Many popular random partition models, such as the Chinese restaurant process and its two-parameter extension, fall in the class of exchangeable random partitions, and have found wide applicability in model-based clustering, population genetics, ecology or network analysis. While the exchangeability assumption is sensible in many cases, it has some strong implications. In particular, Kingman’s representation theorem implies that the size of the clusters necessarily grows linearly with the sample size; this feature may be undesirable for some applications, as recently pointed out by Miller et al. (2015). We present here a flexible class of non-exchangeable random partition models which are able to generate partitions whose cluster sizes grow sublinearly with the sample size, and where the growth rate is controlled by one parameter. Along with this result, we provide the asymptotic behaviour of the number of clusters of a given size, and show that the model can exhibit a power-law behaviour, controlled by another parameter. The construction is based on completely random measures and a Poisson embedding of the random partition, and inference is performed using a Sequential Monte Carlo algorithm. Additionally, we show how the model can also be directly used, by relaxing the exchangeability assumption in edge-exchangeable models, to obtain a class of sparse multigraphs with power-law degree distribution and sublinear growth of the node degrees. Finally, experiments on real datasets emphasize the usefulness of the approach compared to a two-parameter Chinese restaurant process.

Joint work with Giuseppe di Benedetto and Yee Whye Teh
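
For readers unfamiliar with the Chinese restaurant process mentioned in the abstract, the following is a minimal simulation sketch of the standard one-parameter (exchangeable) version only. It is not the non-exchangeable model of the talk; the function name, the concentration parameter `alpha` and the seed are illustrative assumptions.

```python
import random


def chinese_restaurant_process(n, alpha, seed=0):
    """Sample table (cluster) sizes for n customers from a one-parameter CRP.

    Each new customer joins an existing table with probability proportional
    to its current size, or opens a new table with probability proportional
    to alpha. All names here are illustrative, not taken from the talk.
    """
    rng = random.Random(seed)
    sizes = []  # sizes[k] = number of customers currently at table k
    for i in range(n):
        # i customers are already seated, so the total weight is i + alpha.
        u = rng.uniform(0, i + alpha)
        cumulative = 0.0
        for k, size in enumerate(sizes):
            cumulative += size
            if u < cumulative:
                sizes[k] += 1  # join existing table k
                break
        else:
            sizes.append(1)  # no table chosen: open a new one
    return sizes


# Report, for two sample sizes, the number of clusters and the fraction of
# the sample that falls in the largest cluster.
for n in (1000, 10000):
    sizes = chinese_restaurant_process(n, alpha=1.0)
    print(n, len(sizes), max(sizes) / n)
```

In runs of this kind the largest cluster typically holds a non-vanishing fraction of the sample as n grows, which is the linear growth that the abstract describes as a consequence of exchangeability.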


November 12: Stefan Wager (Stanford University)

Title: Noise-induced randomization in regression discontinuity designs

Abstract: Regression discontinuity designs are used to estimate causal effects in settings where treatment is determined by whether an observed running variable crosses a pre-specified threshold. While the resulting sampling design is sometimes described as akin to a locally randomized experiment in a neighborhood of the threshold, standard formal analyses do not make reference to probabilistic treatment assignment and instead identify treatment effects via continuity arguments. Here we propose a new approach to identification, estimation, and inference in regression discontinuity designs that exploits measurement error in the running variable. Under an assumption that the measurement error is exogenous, we show how to consistently estimate causal effects using a class of linear estimators that weight treated and control units so as to balance a latent variable of which the running variable is a noisy measure. We find this approach to facilitate identification of both familiar estimands from the literature, as well as policy-relevant estimands that correspond to the effects of realistic changes to the existing treatment assignment rule. We demonstrate the method with a study of retention of HIV patients and evaluate its performance using simulated data and a regression discontinuity design artificially constructed from test scores in early childhood. 


November 26 at 10am: Wai Kin Wong (Hong Kong Observatory)

Title: Machine learning in rainfall nowcasting

Abstract: Rainfall nowcasting refers to the prediction of precipitation at very high spatial and temporal resolution for the next 1-6 hours. Timely, high-quality rainfall nowcasts provide an indispensable source of information in support of rainstorm monitoring, alerting and warning systems, which are invaluable to weather services and to disaster risk reduction for high-impact weather and rainstorms, protecting people's lives. At the Hong Kong Observatory (HKO), artificial intelligence techniques based on image processing algorithms have been used in the in-house nowcasting system, SWIRLS (Short-range Warning of Intense Rainstorms in Localised Systems), to track the motion of precipitation systems detected by weather radars and then to predict their future location and rainfall using the motion field. However, the intensity is assumed to remain unchanged in the computation, which results in decreasing skill of the precipitation forecast beyond one or two hours. In recent years, novel deep learning (DL) based methods have been developed for precipitation nowcasting and have shown improved performance compared to the above operational algorithm. In this talk, the current progress of DL-based methods for precipitation nowcasting will be introduced, including the mathematical formulation of precipitation nowcasting as a spatiotemporal sequence forecasting problem and a couple of general learning strategies. The performance of a DL-based nowcasting model and a systematic benchmark for performance evaluation will be presented. Finally, future research directions for the development of DL in precipitation nowcasting and meteorological forecasting applications will be discussed.


December 10: Sofia Olhede (EPFL Lausanne)

Title: Modeling networks and network populations via graph distances

Abstract: Networks have become a key form of data. Networks allow us to model dependence between nodes or actors. Understanding the difference between two networks is also challenging unless they share nodes and are of the same size. We shall discuss how we may compare networks and also consider the regime where more than one network is observed.

We shall also discuss how to parametrize a distribution on labelled graphs in terms of a Fréchet mean graph (which depends on a user-specified choice of metric or graph distance) and a parameter that controls the concentration of this distribution about its mean. Entropy is the natural parameter for such control, varying from a point mass concentrated on the Fréchet mean itself to a uniform distribution over all graphs on a given vertex set.

Networks present many new statistical challenges. We shall discuss how to resolve these challenges respecting the fundamental non-Euclidean nature of network observations.

This is joint work with Simon Lunagomez (Lancaster University) and Patrick Wolfe (Purdue University).


January 20: Pragya Sur (Harvard University)

Title: A precise high-dimensional asymptotic theory for AdaBoost

Abstract: This talk will introduce a precise high-dimensional asymptotic theory for AdaBoost on separable data, taking both statistical and computational perspectives. We will consider the common modern setting where the number of features p and the sample size n are both large and comparable, and in particular, look at scenarios where the data is separable in an asymptotic sense. Under a class of statistical models, we will provide an (asymptotically) exact analysis of the generalization error of AdaBoost, when the algorithm interpolates the training data and maximizes an empirical L1 margin. On the computational front, we provide a sharp analysis of the stopping time when boosting approximately maximizes the empirical L1 margin. Our theory provides several insights into properties of Boosting; for instance, the larger the dimensionality ratio p/n, the faster the optimization reaches interpolation. At the heart of our theory lies an in-depth study of the maximum L1-margin, which can be accurately described by a new system of non-linear equations; we analyze this margin and the properties of this system, using Gaussian comparison techniques and a novel uniform deviation argument. Time permitting, I will present a new class of boosting algorithms that correspond to Lq geometry, for q>1, together with results on their high-dimensional generalization and optimization behavior. This is based on joint work with Tengyuan Liang.


February 3: Rina Foygel Barber (University of Chicago)

Title: Testing goodness-of-fit and conditional independence with approximate co-sufficient sampling

Abstract: Goodness-of-fit (GoF) testing is ubiquitous in statistics, with direct ties to model selection, confidence interval construction, conditional independence testing, and multiple testing, just to name a few applications. While testing the GoF of a simple (point) null hypothesis provides an analyst great flexibility in the choice of test statistic while still ensuring validity, most GoF tests for composite null hypotheses are far more constrained, as the test statistic must have a tractable distribution over the entire null model space. A notable exception is co-sufficient sampling (CSS): resampling the data conditional on a sufficient statistic for the null model guarantees valid GoF testing using any test statistic the analyst chooses. But CSS testing requires the null model to have a compact (in an information-theoretic sense) sufficient statistic, which only holds for a very limited class of models; even for a null model as simple as logistic regression, CSS testing is powerless. In this paper, we leverage the concept of approximate sufficiency to generalize CSS testing to essentially any parametric model with an asymptotically-efficient estimator; we call our extension “approximate CSS” (aCSS) testing. We quantify the finite-sample Type I error inflation of aCSS testing and show that it is vanishing under standard maximum likelihood asymptotics, for any choice of test statistic. We apply our proposed procedure both theoretically and in simulation to a number of models of interest to demonstrate its finite-sample Type I error and power.

This work is joint with Lucas Janson.


February 17: Alexandra Carpentier (Otto-von-Guericke-Universität Magdeburg)

Title: Several structured thresholding bandit problems

Abstract: In this talk we will discuss the thresholding bandit problem, i.e. a sequential learning setting where the learner sequentially samples K unknown distributions T times, and aims to output at the end the set of distributions whose means \mu_k are above a threshold \tau. We will study this problem under four structural assumptions, i.e. shape constraints: that the sequence of means is monotone, unimodal, concave, or unstructured (vanilla case). We will provide in each case minimax results on the performance of any strategy, as well as matching algorithms. This will highlight the fact that, even more than in batch learning, structural assumptions have a huge impact in sequential learning. This talk is based on joint work with James Cheshire and Pierre Menard, and with Andrea Locatelli and Maurilio Gutzeit.
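
To make the problem setting in the abstract concrete, here is a minimal sketch of the unstructured (vanilla) case using a naive baseline: split the budget T evenly over the K arms and report those whose empirical mean is at least \tau. This is not one of the algorithms from the talk; the function name, the Bernoulli arms and the chosen means are illustrative assumptions.

```python
import random


def uniform_thresholding(arms, budget, tau):
    """Naive baseline for the thresholding bandit problem.

    arms   : list of zero-argument callables, each returning one sample
    budget : total number of samples T, shared across all arms
    tau    : threshold on the mean
    Returns the set of arm indices whose empirical mean is >= tau.
    """
    k = len(arms)
    pulls = budget // k  # equal allocation; any remainder is left unused
    if pulls < 1:
        raise ValueError("budget must be at least the number of arms")
    selected = set()
    for index, sample in enumerate(arms):
        empirical_mean = sum(sample() for _ in range(pulls)) / pulls
        if empirical_mean >= tau:
            selected.add(index)
    return selected


# Hypothetical example: four Bernoulli arms with means on either side of tau.
rng = random.Random(0)
means = [0.1, 0.2, 0.8, 0.9]
arms = [lambda mu=mu: 1.0 if rng.random() < mu else 0.0 for mu in means]
selected = uniform_thresholding(arms, budget=4000, tau=0.5)
print(sorted(selected))
```

Equal allocation ignores any shape constraint on the means; the structured settings in the abstract (monotone, unimodal, concave) are exactly where an adaptive strategy could be expected to spend the budget more effectively.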


March 3: Liza Levina (University of Michigan)

Title: Hierarchical community detection by recursive partitioning

Abstract: Community detection in networks has been extensively studied in the form of finding a single partition into a “correct” number of communities. In large networks, however, a multi-scale hierarchy of communities is much more realistic. We show that a hierarchical tree of communities, obviously more interpretable, is also potentially more accurate and more computationally efficient. We construct this tree with a simple top-down recursive algorithm, at each step splitting the nodes into two communities with a non-iterative spectral algorithm, until a stopping rule suggests there are no more communities. The algorithm is model-free, extremely fast, and requires no tuning other than selecting a stopping rule. We propose a natural model for this setting, a binary tree stochastic block model, and prove that the algorithm correctly recovers the entire community tree under relatively mild assumptions. As a by-product, we obtain explicit and intuitive results for fitting the stochastic block model under model misspecification. We illustrate the algorithm on a statistics papers dataset constructing a highly interpretable tree of statistics research communities, and on a network based on gene co-occurrence in research papers on anemia.

Joint work with Tianxi Li, Lihua Lei, Sharmodeep Bhattacharyya, Koen van de Berge, Purnamrita Sarkar, and Peter Bickel.


March 17: Peter Bühlmann (ETH Zürich)

Title: Deconfounding

Abstract: Hidden confounding is a severe problem when interpreting regression or causal parameters, and it may also lead to poor generalisation performance for prediction. Adjusting for unobserved confounding is important but challenging when based on observational data only. We propose spectral deconfounding, a class of linear data transformations, followed by standard sparse estimation methods such as the Lasso, or the Debiased Lasso when confidence guarantees are required. The proposed methodology has provable (optimality) properties when assuming dense confounding. Without additional assumptions, deconfounding from observational data is impossible. But we argue that even when such assumptions fail to hold, certain methods exhibit partial robustness against hidden confounding.

The talk is based on joint work with Domagoj Cevid, Zijian Guo and Nicolai Meinshausen.

Wed 5 May, '21
CRiSM Seminars
via Teams