# OxWaSP Mini-Symposia 2017-18

#### Term 2 Mini-Symposia - B3.03, Zeeman

**Talks start at 2pm and finish at 4.30pm with a 30 minute break at 3.00pm**

**Friday 26 January, 2018**

*Jim Smith (Warwick) - François Caron (Oxford)*

TBA

**Friday 9 February, 2018**

*1400-1500 Christian Robert (Warwick) - Chris Holmes (Oxford), Judith Rousseau (Oxford)*

TBA

*1530-1630 Mathieu Gerber (Bristol)*

**Inference in generative models using the Wasserstein distance**

A growing range of generative statistical models are such that the numerical evaluation of their likelihood functions is intractable. Approximate Bayesian computation and indirect inference have become popular approaches to overcome this issue, simulating synthetic data given parameters and comparing summaries of these simulations with the corresponding observed values. We propose to avoid these summaries and the ensuing loss of information through the use of Wasserstein distances between empirical distributions of observed and synthetic data. We describe how the approach can be used in the setting of dependent data such as time series, and how approximations of the Wasserstein distance allow the method to scale to large data sets. In particular, we propose a new approximation to the optimal assignment problem using the Hilbert space-filling curve. We provide an in-depth theoretical study, including consistency in the number of simulated data sets for a fixed number of observations and posterior concentration rates. The approach is illustrated with various examples, including a multivariate g-and-k distribution, a toggle switch model from systems biology, a queueing model, and a Lévy-driven stochastic volatility model.

[This is joint work with B. Epsen, P. Jacob, and C. Robert.]

**Friday 23 February, 2018**

*1400-1500 George Deligiannidis (Oxford)*

*TBA*

*1530-1630 Paul Jenkins (Warwick)*

**Inference from time-dependent feature allocation models**

In a feature allocation model, each data point is described by a collection of latent features, possibly unobserved. For example, we might classify a corpus of texts by describing each document via a set of topics; the topics then determine a distribution over words for that document. In this talk I will describe a new probabilistic model, the 'Wright-Fisher Indian Buffet Process', for collections of time-stamped documents. It is an extension of the popular Indian Buffet Process (IBP) from Bayesian nonparametrics and admits this process as its time-marginals. I will also describe a Markov Chain Monte Carlo algorithm for exact posterior inference and illustrate our construction by analysing the topics of NIPS conference papers over 12 years. This is joint work with Valerio Perrone (OxWaSP, Warwick), Dario Spano (Warwick), and Yee Whye Teh (Oxford).

**Friday 9 March, 2018**

*Yee Whye Teh (Oxford), Matt Kusner (Warwick), Cedric Archambeau (Oxford)*

TBA

_______________________________________________________________________________________________

#### Term 1 Mini-Symposia - OC1.01, The Oculus

**Talks start at 2pm and finish at 4.30pm with a 30 minute break at 3.00pm**

**Friday 20 October, 2017**

*1400-1500 Sara Wade (Warwick, sara.wade@warwick.ac.uk)*

**Adaptive truncation of a Bayesian nonparametric multivariate regression model for a study of fertility and partnership patterns of Colombian women**

Abstract: We propose a flexible Bayesian nonparametric multivariate regression model, which can capture nonlinear regression functions and the presence of non-normal errors, such as heavy tails or multi-modality. The infinite mixture model has interpretable covariate-dependent weights constructed through normalization, allowing for combinations of both discrete and continuous covariates, and extends the model developed in (Antoniano-Villalobos et al., 2014) for a multivariate response. The infinite number of components and intractable normalizing constant pose computational difficulties, which are overcome through an adaptive truncation algorithm (Griffin, 2014). The algorithm combines adaptive Metropolis-Hastings with sequential Monte Carlo to create a sequence of truncated posteriors and automatically determines the level of truncation. The model and algorithm are applied to a lifestyle study on Colombian women, which aims to understand the relationship between some focal life events (e.g. age at first sexual intercourse, relationship, child, presence in the labour market) and various baseline factors, such as year of birth, region of birth, and ethnicity. Regression function and conditional density estimates are presented, along with an analysis of the implied covariate-dependent clustering.

*1530-1630 Yi Yu (Bristol, yuyi1226@gmail.com)*

**Covariance change point detection in high dimension settings**

Abstract: In this paper, we tackle the high dimensional covariance change point detection problem without extra assumptions on the covariance structure, where the dimension $p$ is smaller than the sample size $n$ but $p$ is allowed to diverge as $n \to \infty$; to be specific, the observations $X_i \in \mathbb{R}^p$, $i = 1, \ldots, n$ are independent sub-Gaussian random vectors with covariance $\Sigma_i$, and $\bigl\{\Sigma_i\bigr\}_{i=1}^n$ are piecewise constant. Methods based on binary segmentation (e.g. Vostrikova, 1981) and wild binary segmentation (Fryzlewicz, 2014) are introduced, with consistency results that coincide with those of the mean change point detection problem. In addition, we propose a variant of wild binary segmentation using random projection, namely wild binary segmentation with random projection, whose change point localization rate is improved and is proved to be optimal in the minimax sense.

**Friday 3 November, 2017**

*1400-1500 Francesco Cappuccio (University of Warwick) Professor of Cardiovascular Medicine & Epidemiology*

**Sleep, hypertension and obesity: exploring causal relationships**

Sleep duration is affected by a variety of cultural, social, psychological, behavioural, pathophysiological and environmental influences. Changes in modern society—like longer working hours, more shift-work, 24/7 availability of commodities and 24-h global connectivity—have been associated with a gradual reduction in sleep duration and sleeping patterns across westernised populations. We review the evidence of an association between sleep disturbances and the development of cardio-metabolic risk and disease and discuss the implications for causality of these associations.

*1530-1630 Laura Bonnett (University of Liverpool, NIHR Post-Doctoral Fellow)*

**Modelling Recurrent Seizures in Epilepsy**

Epilepsy is defined as the tendency to have recurrent seizures. However, clinical trials in this area often model time to a specific event such as treatment failure, or 12-month remission from seizures, rather than considering all post-randomisation seizures. This talk will compare three different models for risk of seizure recurrence post-randomisation using all recorded seizures within the Standard Versus New Antiepileptic Drug study. Innovative graphical methods will be demonstrated which aid comparisons across models and assess model fits utilising predictions from the models.

**Friday 17 November, 2017**

*1400-1500 Mathew Penrose (M.D.Penrose@bath.ac.uk)*

**Random geometric graphs: a survey**

Abstract: Graphs and networks are ubiquitous in modern science and society, and random graph models play a major role in underpinning the statistical analysis of network data. One such model is the random geometric graph: a large number of vertices are scattered randomly in a spatial region, and two vertices are connected by an edge whenever they are sufficiently close together.

This is a very natural model for networks with spatial content and randomness; for example, mobile communications networks.

In this talk we address some of the basic questions one might ask about such a graph, such as the following.

Is it connected; that is, are any two vertices connected by a path?

If not, how many pieces does the graph split into?

How large are these pieces? In particular, does the largest piece contain a significant proportion of the vertices?

How many edges, triangular subgraphs and so on does the graph have?

We shall also explore the relationship with other random graph models, so far as time permits.

*1530-1630 Paul Chleboun (paul.chleboun@stats.ox.ac.uk)*

**The dynamics of spin plaquette models.**

Abstract: We will examine the dynamics of a certain class of continuous-time Markov processes called spin plaquette models. These are finite range spin systems which evolve according to Glauber dynamics. They have recently attracted a lot of attention in the physics literature because they are expected to exhibit glassy behaviour, despite the absence of any disorder (like spin glasses) or hard constraints in the dynamics (like kinetically constrained models).

We will first discuss equilibrium results related to the stationary distribution of these models, and then discuss some results on the dynamical behaviour. In particular we will look at the rate of convergence to the stationary distribution and discuss some of the tools used to derive useful bounds on the speed of convergence.

This is work in progress with A. Faggionato, F. Martinelli, C. Toninelli, and A. Smith.

**Friday 1 December, 2017**

*1400-1500 Sofia Olhede (s.olhede@ucl.ac.uk)*

**Statistical Analysis of Network Data**

Abstract: In recent years the ready availability of network data has fuelled statistical interest in the field. There are many open problems, but also interesting developments. This talk will cover the concept of networks as data, rather than as an underlying but unobserved model structure, and popular models such as degree-based models, the stochastic block model, and generalisations thereof. Given time, I will also briefly touch on multiplex networks and hypergraphs.

*1530-1630 Thomas Hills (thomhills@gmail.com)*

**Using network analysis of language to understand cognition**

Abstract: Language has structure. Network analysis can help us evaluate what aspects of this structure best predict language learning, learning deficits, age-related cognitive decline, and general trends in mental change across the lifespan. I will discuss some of my recent work in this area, including computational models of network growth and cognitive navigation, and describe some of the problems that remain to be solved, including network representations of natural language and why too much information may be just as bad for minds as too little.