# OxWaSP Mini-Symposia 2017-18

#### Term 2 Mini-Symposia - B3.03, Zeeman

**Talks start at 2pm and finish at 4.30pm with a 30 minute break at 3.00pm**

**Friday 26 January, 2018**

*Jim Smith (Warwick) - François Caron (Oxford)*

*1400-1500 Mihaela van der Schaar (Oxford Man)*

**AutoPrognosis**

Mihaela's work uses data science and machine learning to create models that assist diagnosis and prognosis. Existing models suffer from two kinds of problems. Statistical models that are driven by theory/hypotheses are easy to apply and interpret, but they make many assumptions and often have inferior predictive accuracy. Machine learning models can be crafted to the data and often have superior predictive accuracy, but they are often hard to interpret and must be crafted for each disease … and there are a lot of diseases. In this talk I present a method (AutoPrognosis) that makes machine learning itself do both the crafting and interpreting. For medicine, this is a complicated problem because missing data must be imputed, relevant features/covariates must be selected, and the most appropriate classifier(s) must be chosen. Moreover, there is no one “best” imputation algorithm, feature processing algorithm, or classification algorithm; some imputation algorithms will work better with a particular feature processing algorithm and a particular classifier in a particular setting. To deal with these complications, we need an entire pipeline. Because there are many possible pipelines, we need a machine learning method for this purpose, and this is exactly what AutoPrognosis is: an automated process for creating a particular pipeline for each particular setting. Using a variety of medical datasets, we show that AutoPrognosis achieves performance that is significantly superior to existing clinical approaches and statistical and machine learning methods.
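The combinatorial nature of the pipeline problem can be illustrated with a deliberately tiny sketch. This is not the actual AutoPrognosis system: the imputation rules, the one-feature threshold classifier, and the toy data below are all invented for illustration. The point is only that the pipeline is chosen jointly, not stage by stage:

```python
import random

def impute(x, rule):
    """Fill missing values (None) with zero or with the observed mean."""
    observed = [v for v in x if v is not None]
    fill = {"zero": 0.0, "mean": sum(observed) / len(observed)}[rule]
    return [fill if v is None else v for v in x]

def evaluate(features, labels, rule, threshold):
    """Accuracy of a one-feature threshold classifier after imputation."""
    filled = impute(features, rule)
    return sum((f > threshold) == y for f, y in zip(filled, labels)) / len(labels)

rng = random.Random(0)
labels = [rng.random() < 0.5 for _ in range(200)]
# The feature correlates with the label; about 30% of values are missing.
features = [None if rng.random() < 0.3 else rng.gauss(2.0 if y else 0.0, 1.0)
            for y in labels]

# Exhaustive search over the (small) space of pipelines: each candidate is a
# (imputation rule, classifier threshold) pair, scored jointly.
pipelines = [(rule, th) for rule in ("zero", "mean") for th in (0.5, 1.0, 1.5)]
best = max(pipelines, key=lambda p: evaluate(features, labels, *p))
print(best)
```

The real system faces a vastly larger space of imputers, feature processors, and classifiers, so it cannot rely on enumeration as this toy does.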

*1530-1630 Jim Griffin (Kent)*

**Bayesian nonparametric vector autoregressive models**

Abstract: Vector autoregressive (VAR) models are the main work-horse model for macroeconomic forecasting, and provide a framework for the analysis of the complex dynamics present between macroeconomic variables. Whether a classical or a Bayesian approach is adopted, most VAR models are linear with Gaussian innovations. This can limit the model’s ability to explain the relationships in macroeconomic series. We propose a nonparametric VAR model that allows for nonlinearity in the conditional mean, heteroscedasticity in the conditional variance, and non-Gaussian innovations. Our approach differs from that of previous studies by modelling the stationary and transition densities using Bayesian nonparametric methods. Our Bayesian nonparametric VAR (BayesNP-VAR) model is applied to US and UK macroeconomic time series, and compared to other Bayesian VAR models. We show that BayesNP-VAR is a flexible model that is able to account for nonlinear relationships as well as heteroscedasticity in the data. In terms of short-run out-of-sample forecasts, we show that BayesNP-VAR predictively outperforms competing models.
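For readers less familiar with the baseline being generalised here, a linear Gaussian VAR(1) can be simulated in a few lines (a minimal sketch; the coefficient matrix and noise scale are arbitrary choices, not values from the talk):

```python
import random

def simulate_var1(A, sigma, T, seed=0):
    """Simulate a bivariate VAR(1): x_t = A x_{t-1} + e_t, e_t ~ N(0, sigma^2 I)."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    path = [x]
    for _ in range(T):
        e0, e1 = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        x = [A[0][0] * x[0] + A[0][1] * x[1] + e0,
             A[1][0] * x[0] + A[1][1] * x[1] + e1]
        path.append(x)
    return path

# A stationary example: both eigenvalues of A (0.5 and 0.8) lie inside the unit circle.
path = simulate_var1(A=[[0.5, 0.1], [0.0, 0.8]], sigma=0.2, T=500)
```

The BayesNP-VAR model replaces both the linear conditional mean and the Gaussian innovation distribution above with Bayesian nonparametric components.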

**Friday 9 February, 2018**

*Christian Robert (Warwick) - Chris Holmes (Oxford), Judith Rousseau (Oxford)*

*1400-1500 Jean-Michel Marin (University of Montpellier)*

**Machine learning tools for Bayesian inference on intractable likelihood models**

As statistical models and data structures become increasingly complex, managing the likelihood function becomes a more and more frequent issue. We now face many realistic fully parametric situations where the likelihood function cannot be computed in a reasonable time, or is simply unavailable. As a result, while the corresponding parametric model is well-defined, standard solutions based on the likelihood function, such as Bayesian or maximum likelihood analyses, are prohibitive to implement. To bypass this hurdle, the last decade witnessed different inferential strategies, among which are composite likelihoods, indirect inference, GMMs, and likelihood-free methods such as Approximate Bayesian Computation (ABC). We focus on the last of these, and consider versions of ABC that use machine learning algorithms on reference tables: tables that are simulated from the Bayesian model and used as a learning set. Among all the possible machine learning strategies, two have exhibited good empirical behaviour: random forests and mixture density networks. In this talk, we describe these two methodologies and the associated ABC schemes.
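The reference-table idea can be shown in a stripped-down form. In the sketch below, a simple nearest-neighbour average stands in for the random-forest or mixture-density-network regression of the talk, and the Gaussian model and prior are invented for illustration:

```python
import random

def abc_reference_table(y_obs, n_sim=5000, k=100, seed=1):
    """Toy ABC: infer the mean theta of a Gaussian from its sample mean.

    Build a reference table of (parameter, summary) pairs simulated from the
    prior, then estimate theta by averaging over the k table entries whose
    summaries fall closest to the observed summary.
    """
    rng = random.Random(seed)
    s_obs = sum(y_obs) / len(y_obs)
    table = []
    for _ in range(n_sim):
        theta = rng.gauss(0.0, 5.0)                 # draw from the prior
        y = [rng.gauss(theta, 1.0) for _ in range(len(y_obs))]
        table.append((theta, sum(y) / len(y)))      # store (parameter, summary)
    table.sort(key=lambda row: abs(row[1] - s_obs))
    return sum(theta for theta, _ in table[:k]) / k

rng = random.Random(0)
y_obs = [rng.gauss(2.0, 1.0) for _ in range(50)]
print(abc_reference_table(y_obs))  # an estimate near the true mean of 2
```

The machine learning methods in the talk replace the nearest-neighbour step with a regression learned on the whole table, which scales far better with the dimension of the summaries.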

*1530-1630 Mathieu Gerber (Bristol)*

**Inference in generative models using the Wasserstein distance**

A growing range of generative statistical models are such that the numerical evaluation of their likelihood functions is intractable. Approximate Bayesian computation and indirect inference have become popular approaches to overcome this issue, simulating synthetic data given parameters and comparing summaries of these simulations with the corresponding observed values. We propose to avoid these summaries and the ensuing loss of information through the use of Wasserstein distances between empirical distributions of observed and synthetic data. We describe how the approach can be used in the setting of dependent data such as time series, and how approximations of the Wasserstein distance allow the method to scale to large data sets. In particular, we propose a new approximation to the optimal assignment problem using the Hilbert space-filling curve. We provide an in-depth theoretical study, including consistency in the number of simulated data sets for a fixed number of observations and posterior concentration rates. The approach is illustrated with various examples, including a multivariate g-and-k distribution, a toggle switch model from systems biology, a queueing model, and a Lévy-driven stochastic volatility model.

[This is joint work with E. Bernton, P. Jacob, and C. Robert.]
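A useful computational fact behind this approach is that, in one dimension, the optimal assignment between two equal-size samples is obtained simply by sorting them; the Hilbert space-filling curve is a device for extending that sorting trick to higher dimensions. A minimal one-dimensional sketch (the Gaussian toy data is invented for illustration):

```python
import random

def wasserstein1d(xs, ys):
    """W1 distance between two equal-size empirical distributions in 1-D:
    sort both samples and average the absolute differences, since the
    optimal assignment in one dimension is the monotone (sorted) one."""
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)

rng = random.Random(0)
obs = [rng.gauss(0.0, 1.0) for _ in range(200)]

# ABC-style comparison: synthetic data simulated from the right mean scores a
# much smaller distance than synthetic data simulated from a wrong mean.
synth_good = [rng.gauss(0.0, 1.0) for _ in range(200)]
synth_bad = [rng.gauss(3.0, 1.0) for _ in range(200)]
print(wasserstein1d(obs, synth_good), wasserstein1d(obs, synth_bad))
```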

**Friday 23 February, 2018**

*Arnaud Doucet (Oxford) - Adam Johansen (Warwick)*

*1400-1500 George Deligiannidis (Oxford)*

**Piecewise Deterministic Markov Chain Monte Carlo methods**

Abstract: I will give an introduction to non-reversible MCMC algorithms, in particular algorithms based on piecewise deterministic processes. I will then give an overview of recent methodological progress and theoretical developments.

*1530-1630 Paul Jenkins (Warwick)*

**Inference from time-dependent feature allocation models**

In a feature allocation model, each data point is described by a collection of latent features, possibly unobserved. For example, we might classify a corpus of texts by describing each document via a set of topics; the topics then determine a distribution over words for that document. In this talk I will describe a new probabilistic model, the 'Wright-Fisher Indian Buffet Process', for collections of time-stamped documents. It is an extension of the popular Indian Buffet Process (IBP) from Bayesian nonparametrics and admits this process as its time-marginals. I will also describe a Markov Chain Monte Carlo algorithm for exact posterior inference and illustrate our construction by analysing the topics of NIPS conference papers over 12 years. This is joint work with Valerio Perrone (OxWaSP, Warwick), Dario Spano (Warwick), and Yee Whye Teh (Oxford).
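The standard IBP that the talk extends can be simulated directly from its "restaurant" construction. This is a minimal sketch of the classical IBP only, not of the Wright-Fisher extension:

```python
import math
import random

def sample_ibp(n_customers, alpha, seed=0):
    """Draw a binary feature-allocation matrix from the Indian Buffet Process.

    Customer i takes each existing dish k with probability m_k / i, where m_k
    is the number of previous customers who took dish k, and then samples a
    Poisson(alpha / i) number of brand-new dishes.
    """
    rng = random.Random(seed)
    counts = []                               # m_k for each dish seen so far
    rows = []
    for i in range(1, n_customers + 1):
        row = [1 if rng.random() < m / i else 0 for m in counts]
        for k, z in enumerate(row):
            counts[k] += z
        # Sample Poisson(alpha / i) by inverting the CDF.
        lam = alpha / i
        u, p, n_new = rng.random(), math.exp(-lam), 0
        cdf = p
        while u > cdf:
            n_new += 1
            p *= lam / n_new
            cdf += p
        row.extend([1] * n_new)
        counts.extend([1] * n_new)
        rows.append(row)
    width = len(counts)                       # pad rows to a rectangle
    return [r + [0] * (width - len(r)) for r in rows]

Z = sample_ibp(10, alpha=2.0)
```

Each row of `Z` is one data point's latent feature set; the Wright-Fisher construction lets such matrices evolve over time while keeping this IBP as the marginal at each time point.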

**Friday 9 March, 2018**

*Yee Whye Teh (Oxford), Matt Kusner (Warwick), Cedric Archambeau (Oxford)*

*1400-1500 Shakir Mohamed (DeepMind, London)*

**Deep Generative Models**

This talk will be a review of recent advances in deep generative models. Generative models are increasingly popular, and recent methods have combined the generality of probabilistic reasoning with the scalability of deep learning to develop learning algorithms that have been applied to a wide variety of problems, giving state-of-the-art results in image generation, text-to-speech synthesis, and image captioning, amongst many others. Advances in deep generative models are at the forefront of deep learning research because of the promise they hold for allowing data-efficient learning and for model-based reinforcement learning. We will begin by covering three of the most active types of models: Markov models, latent variable models, and implicit models. We will then explore how these models can be scaled to high-dimensional data. I hope to expose some of the questions that remain in this area, for which there remains a great deal of opportunity for further research.

*1530-1630 Marc Deisenroth (Imperial College London)*

**Data-Efficient Learning for Autonomous Robots**

Abstract: Trial-and-error based reinforcement learning (RL) has seen rapid advancements in recent times, especially with the advent of deep neural networks. However, the majority of autonomous RL algorithms either rely on engineered features or a large number of interactions with the environment. Such a large number of interactions may be impractical in many real-world applications. For example, robots are subject to wear and tear and, hence, millions of interactions may change or damage the system.

To address this problem, current learning approaches typically require task-specific knowledge in the form of expert demonstrations, pre-shaped policies, or the underlying dynamics. In the first part of the talk, I follow a different approach and speed up learning by efficiently extracting information from sparse data. In particular, we propose to learn a probabilistic, non-parametric Gaussian process dynamics model.

By explicitly incorporating model uncertainty into long-term planning and controller learning, my approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the-art reinforcement learning, our model-based policy search method achieves an unprecedented speed of learning, which makes it most promising for application to real systems. We demonstrate its applicability to autonomous learning from scratch on real robot and control tasks. To reduce the number of system interactions while naturally handling state or control constraints, we extend the above framework and propose a model-based RL framework based on Model Predictive Control (MPC) using learned probabilistic dynamics models. We provide theoretical guarantees of first-order optimality in the GP-based transition models with deterministic approximate inference for long-term planning. The proposed framework demonstrates superior data efficiency and learning rates compared to the current state of the art.
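As a concrete picture of the central ingredient, a Gaussian process regression posterior needs nothing more than a kernel and a linear solve. The toy below fits a one-dimensional made-up "dynamics" map with an RBF kernel; real GP dynamics models are multivariate and learned from state-action transition data, so this is a sketch under simplifying assumptions, not the method of the talk:

```python
import math

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel in one dimension."""
    return math.exp(-0.5 * ((a - b) / ell) ** 2)

def solve(A, b):
    """Gaussian elimination with partial pivoting (fine for small systems)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(xs, ys, x_star, noise=0.01):
    """Posterior mean and variance of a GP with an RBF kernel at x_star."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(K, ys)                     # (K + noise I)^{-1} y
    k_star = [rbf(x, x_star) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)                     # (K + noise I)^{-1} k_star
    var = rbf(x_star, x_star) - sum(k * w for k, w in zip(k_star, v))
    return mean, var

# Learn a toy one-dimensional "dynamics" map x' = sin(x) from a few transitions.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [math.sin(x) for x in xs]
mean, var = gp_predict(xs, ys, 1.25)
print(mean, var)  # mean near sin(1.25), variance small
```

The posterior variance is what the talk's planner propagates through time, so that controller learning accounts for how unsure the model is.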

Key references:

[1] Marc P. Deisenroth, Dieter Fox, Carl E. Rasmussen, Gaussian Processes for Data-Efficient Learning in Robotics and Control, IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 37, pp. 408–423, 2015

https://arxiv.org/abs/1502.02860

[2] Sanket Kamthe, Marc P. Deisenroth, Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control, AISTATS 2018

https://arxiv.org/abs/1706.06491

_______________________________________________________________________________________________

#### Term 1 Mini-Symposia - OC1.01, The Oculus

**Talks start at 2pm and finish at 4.30pm with a 30 minute break at 3.00pm**

**Friday 20 October, 2017**

*1400-1500 Sara Wade (Warwick, sara.wade@warwick.ac.uk)*

**Adaptive truncation of a Bayesian nonparametric multivariate regression model for a study of fertility and partnership patterns of Colombian women**

Abstract: We propose a flexible Bayesian nonparametric multivariate regression model, which can capture nonlinear regression functions and the presence of non-normal errors, such as heavy tails or multi-modality. The infinite mixture model has interpretable covariate-dependent weights constructed through normalization, allowing for combinations of both discrete and continuous covariates, and extends the model developed in (Antoniano-Villalobos et al., 2014) for a multivariate response. The infinite number of components and intractable normalizing constant pose computational difficulties, which are overcome through an adaptive truncation algorithm (Griffin, 2014). The algorithm combines adaptive Metropolis-Hastings with sequential Monte Carlo to create a sequence of truncated posteriors and automatically determines the level of truncation. The model and algorithm are applied to a lifestyle study on Colombian women, which aims to understand the relationship between some focal life events (e.g. age at first sexual intercourse, relationship, child, presence in the labour market) and various baseline factors, such as year of birth, region of birth, and ethnicity. Regression function and conditional density estimates are presented, along with an analysis of the implied covariate-dependent clustering.

*1530-1630 Yi Yu (Bristol, yuyi1226@gmail.com)*

**Covariance change point detection in high-dimensional settings**

Abstract: In this paper, we tackle the high-dimensional covariance change point detection problem without extra assumptions on the covariance structure, where the dimension $p$ is smaller than the sample size $n$ but $p$ is allowed to diverge as $n \to \infty$; to be specific, the observations $X_i \in \mathbb{R}^p$, $i = 1, \ldots, n$ are independent sub-Gaussian random vectors with covariances $\Sigma_i$, and $\bigl\{\Sigma_i\bigr\}_{i=1}^n$ are piecewise constant. Methods based on binary segmentation \citep[e.g.][]{vostrikova1981detection} and wild binary segmentation \citep{fryzlewicz2014wild} are introduced, with consistency results that coincide with those of mean change point detection problems. In addition, we propose a variant of wild binary segmentation using random projection, namely wild binary segmentation with random projection, whose change point localisation rate is improved and is proved to be optimal in the minimax sense.
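Binary segmentation itself is easy to state in code: compute a CUSUM statistic over every split of the current segment, declare a change point where the statistic peaks above a threshold, and recurse on both sides. The sketch below uses the classical mean-change CUSUM on a univariate series (the paper's statistics act on covariance matrices and are more involved; the threshold here is an arbitrary illustrative value):

```python
import math

def cusum_split(x, lo, hi):
    """Best CUSUM statistic and split location over the segment x[lo:hi]."""
    n = hi - lo
    total = sum(x[lo:hi])
    best, best_t, left = 0.0, None, 0.0
    for t in range(1, n):
        left += x[lo + t - 1]
        # CUSUM statistic comparing the mean before and after the split.
        stat = abs(math.sqrt((n - t) / (n * t)) * left
                   - math.sqrt(t / (n * (n - t))) * (total - left))
        if stat > best:
            best, best_t = stat, lo + t
    return best, best_t

def binary_segmentation(x, threshold, lo=0, hi=None):
    """Recursively declare a change point wherever the CUSUM statistic
    exceeds the threshold, then recurse on the two halves."""
    if hi is None:
        hi = len(x)
    if hi - lo < 2:
        return []
    stat, t = cusum_split(x, lo, hi)
    if t is None or stat <= threshold:
        return []
    return (binary_segmentation(x, threshold, lo, t) + [t]
            + binary_segmentation(x, threshold, t, hi))

# A mean shift at index 50 is recovered exactly on this noiseless example.
x = [0.0] * 50 + [5.0] * 50
print(binary_segmentation(x, threshold=3.0))  # → [50]
```

Wild binary segmentation improves on this by maximising the statistic over many randomly drawn sub-intervals rather than only over full segments.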

**Friday 3 November, 2017**

*1400-1500 Francesco Cappuccio (University of Warwick), Professor of Cardiovascular Medicine & Epidemiology*

**Sleep, hypertension and obesity: exploring causal relationships**

Sleep duration is affected by a variety of cultural, social, psychological, behavioural, pathophysiological and environmental influences. Changes in modern society—like longer working hours, more shift-work, 24/7 availability of commodities and 24-h global connectivity—have been associated with a gradual reduction in sleep duration and sleeping patterns across westernised populations. We review the evidence of an association between sleep disturbances and the development of cardio-metabolic risk and disease and discuss the implications for causality of these associations.

*1530-1630 Laura Bonnett (University of Liverpool, NIHR Post-Doctoral Fellow)*

**Modelling Recurrent Seizures in Epilepsy**

Epilepsy is defined as the tendency to have recurrent seizures. However, clinical trials in this area often model time to a specific event, such as treatment failure or 12-month remission from seizures, rather than considering all post-randomisation seizures. This talk will compare three different models for the risk of seizure recurrence post-randomisation, using all recorded seizures within the Standard Versus New Antiepileptic Drug study. Innovative graphical methods will be demonstrated which aid comparison across models and assess model fit using predictions from the models.

**Friday 17 November, 2017**

*1400-1500 Mathew Penrose (M.D.Penrose@bath.ac.uk)*

**Random geometric graphs: a survey**

Abstract: Graphs and networks are ubiquitous in modern science and society, and random graph models play a major role in underpinning the statistical analysis of network data. One such model is the random geometric graph: a large number of vertices are scattered randomly in a spatial region, and two vertices are connected by an edge whenever they are sufficiently close together.

This is a very natural model for networks with spatial content and randomness; for example, mobile communications networks.

In this talk we address some of the basic questions one might ask about such a graph, such as the following.

* Is it connected; that is, are any two vertices connected by a path?
* If not, how many pieces does the graph split into?
* How large are these pieces? In particular, does the largest piece contain a significant proportion of the vertices?
* How many edges, triangular subgraphs, and so on does the graph have?

We shall also explore the relationship with other random graph models, so far as time permits.
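Several of these questions can be explored empirically in a few lines: scatter points, connect close pairs, and read off the connected component sizes with a union-find structure (a minimal sketch; the radii below are arbitrary choices for illustration):

```python
import random

def random_geometric_graph(n, r, seed=0):
    """Scatter n points uniformly in the unit square, connect pairs within
    distance r, and return the connected component sizes, largest first."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    parent = list(range(n))

    def find(i):                      # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]
            if dx * dx + dy * dy <= r * r:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(n):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)

# A dense radius gives one giant component; a tiny radius fragments the graph.
print(random_geometric_graph(200, 0.2))
print(random_geometric_graph(200, 0.01))
```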

*1530-1630 Paul Chleboun (paul.chleboun@stats.ox.ac.uk)*

**The dynamics of spin plaquette models**

Abstract: We will examine the dynamics of a certain class of continuous time Markov process called spin plaquette models. These are finite range spin systems which evolve according to Glauber dynamics. They have recently attracted a lot of attention in the physics literature because they are expected to exhibit glassy behavior, despite the absence of any disorder (like spin glasses) or hard constraints in the dynamics (like kinetically constrained models).

We will first discuss equilibrium results related to the stationary distribution of these models, and then discuss some results on the dynamical behaviour. In particular we will look at the rate of convergence to the stationary distribution and discuss some of the tools used to derive useful bounds on the speed of convergence.

This is work in progress with A. Faggionato, F. Martinelli, C. Toninelli, and A. Smith.

**Friday 1 December, 2017**

*1400-1500 Sofia Olhede (s.olhede@ucl.ac.uk)*

**Statistical Analysis of Network Data**

Abstract: In recent years the ready availability of network data has fuelled statistical interest in the field. There are many open problems, but also interesting developments. This talk will cover the concept of networks as data, rather than as an underlying but unobserved model structure, and popular models such as degree-based models, the stochastic block model, and generalisations thereof. Given time, I will also briefly touch on multiplex networks and hypergraphs.

*1530-1630 Thomas Hills (thomhills@gmail.com)*

**Using network analysis of language to understand cognition**

Abstract: Language has structure. Network analysis can help us evaluate what aspects of this structure best predict language learning, learning deficits, age-related cognitive decline, and general trends in mental change across the lifespan. I will discuss some of my recent work in this area, including computational models of network growth and cognitive navigation, and describe some of the problems that remain to be solved, including network representations of natural language and why too much information may be just as bad for minds as too little.