Events
Thu 3 Jan, '19 
AS&RU CPS Network Meeting, MB1.05

Wed 9 Jan, '19 
Dept Council Meeting, Stats Common Room

Fri 11 Jan, '19 
Algorithms Seminar, MB0.08

Tue 15 Jan, '19 
YRM, Statistics Common Room

Wed 16 Jan, '19 
SSLC 13:00-15:00, MSB2.22

Thu 17 Jan, '19 
Management Group, MB1.06

Thu 17 Jan, '19 
CRiSM Seminar, MSB2.23
Prof. Galin Jones, School of Statistics, University of Minnesota (14:00-15:00)
Bayesian Spatiotemporal Modeling Using Hierarchical Spatial Priors, with Applications to Functional Magnetic Resonance Imaging
We propose a spatiotemporal Bayesian variable selection model for detecting activation in functional magnetic resonance imaging (fMRI) settings. Following recent research in this area, we use binary indicator variables for classifying active voxels. We assume that the spatial dependence in the images can be accommodated by applying an areal model to parcels of voxels. The use of parcellation and a spatial hierarchical prior (instead of the popular Ising prior) results in a posterior distribution amenable to exploration with an efficient Markov chain Monte Carlo (MCMC) algorithm. We study the properties of our approach by applying it to simulated data and an fMRI data set.
Dr. Flavio Goncalves, Universidade Federal de Minas Gerais, Brazil (15:00-16:00)
Exact Bayesian inference in spatiotemporal Cox processes driven by multivariate Gaussian processes
In this talk we present a novel methodology for Bayesian inference in spatiotemporal Cox processes where the intensity function depends on a multivariate Gaussian process. Dynamic Gaussian processes are introduced to allow for evolution of the intensity function over discrete time. The novelty of the method lies in the fact that no discretisation error is involved, despite the non-tractability of the likelihood function and the infinite dimensionality of the problem. The method is based on a Markov chain Monte Carlo algorithm that samples from the joint posterior distribution of the parameters and latent variables of the model. The models are defined in a general and flexible way, but they are amenable to direct sampling from the relevant distributions, due to careful characterisation of their components. The models also allow for the inclusion of regression covariates and/or temporal components to explain the variability of the intensity function. These components may be subject to relevant interaction with space and/or time. Real and simulated examples illustrate the methodology, followed by concluding remarks.

Thu 17 Jan, '19 
Machine Learning Seminar, MB2.23

Fri 18 Jan, '19 
Algorithms Seminar, MB0.08

Tue 22 Jan, '19 
YRM, Statistics Common Room

Wed 23 Jan, '19 
Teaching Committee 13:00-16:00, MSB2.22

Thu 24 Jan, '19 
Machine Learning Seminar, MB2.22

Fri 25 Jan, '19 
Algorithms Seminar, MB0.08

Tue 29 Jan, '19 
Reader in Actuarial Statistics - Presentation, MB2.23

Tue 29 Jan, '19 
YRM, Statistics Common Room

Tue 29 Jan, '19 
WCC, MB1.05

Wed 30 Jan, '19 
Management Group, MB1.06

Thu 31 Jan, '19 
CRiSM Seminar, MSB2.23
Professor Paul Fearnhead, Lancaster University (14:00-15:00)
Efficient Approaches to Changepoint Problems with Dependence Across Segments
Changepoint detection is an increasingly important problem across a range of applications. It is most commonly encountered when analysing time-series data, where changepoints correspond to points in time where some feature of the data, for example its mean, changes abruptly. Often there are important computational constraints when analysing such data, with the number of data sequences and their lengths meaning that only very efficient methods for detecting changepoints are practically feasible. A natural way of estimating the number and location of changepoints is to minimise a cost that trades off a measure of fit to the data against the number of changepoints fitted. There are now some efficient algorithms that can exactly solve the resulting optimisation problem, but they are only applicable in situations where there is no dependence of the mean of the data across segments. Using such methods can lead to a loss of statistical efficiency in situations where, for example, it is known that the change in mean must be positive. This talk will present a new class of efficient algorithms that can exactly minimise our cost whilst imposing certain constraints on the relationship of the mean before and after a change. These algorithms have links to recursions that are seen for discrete-state hidden Markov models, and within sequential Monte Carlo. We demonstrate the usefulness of these algorithms on problems such as detecting spikes in calcium imaging data. Our algorithm can analyse data of length 100,000 in less than a second, and has been used by the Allen Brain Institute to analyse the spike patterns of over 60,000 neurons. (This is joint work with Toby Hocking, Sean Jewell, Guillem Rigaill and Daniela Witten.)
Dr. Sandipan Roy, Department of Mathematical Sciences, University of Bath (15:00-16:00)
Network Heterogeneity and Strength of Connections
Detecting strength of connection in a network is a fundamental problem in understanding the relationship among individuals. Often it is more important to understand how strongly two individuals are connected than the mere presence or absence of an edge. This paper introduces a new concept of strength of connection in a network through a nonparametric object called “Grafield”. “Grafield” is a piecewise constant bivariate kernel function that compactly represents the affinity or strength of ties (or interactions) between every pair of vertices in the graph. We estimate the “Grafield” function through a spectral analysis of the Laplacian matrix followed by a hard thresholding (Gavish & Donoho, 2014) of the singular values. Our estimation methodology is also valid for asymmetric directed networks. As a by-product, we also get an efficient procedure for edge probability matrix estimation. We validate our proposed approach with several synthetic experiments and compare it with existing algorithms for edge probability matrix estimation. We also apply our proposed approach to three real datasets, to understand the strength of connection in (a) a social messaging network, (b) a network of political parties in the US Senate and (c) a neural network of neurons and synapses in C. elegans, a type of worm.

Fri 1 Feb, '19 
Algorithms Seminar, MB0.08

Tue 5 Feb, '19 
YRM, Statistics Common Room

Thu 7 Feb, '19 
AS&RU CPS Network Meeting, MB1.05

Fri 8 Feb, '19 
Algorithms Seminar, MB0.08

Mon 11 Feb, '19 
Teaching Forum, Stats Common Room

Tue 12 Feb, '19 
YRM, Statistics Common Room

Wed 13 Feb, '19 
Management Group, MB1.05

Thu 14 Feb, '19 
Research SSLC, MB1.06

Thu 14 Feb, '19 
CRiSM Seminar, MSB2.23
Philipp Hermann, Institute of Applied Statistics, Johannes Kepler University Linz, Austria (14:00-15:00)
LDJump: Estimating Variable Recombination Rates from Population Genetic Data
Recombination is a process during meiosis which starts with the formation of DNA double-strand breaks and results in an exchange of genetic material between homologous chromosomes. In many species, recombination is concentrated in narrow regions known as hotspots, flanked by large zones with low recombination. As recombination plays an important role in evolution, its estimation and the identification of hotspot positions are of considerable interest. In this talk we introduce LDJump, our method to estimate local population recombination rates with relevant summary statistics as explanatory variables in a regression model. More precisely, we divide the DNA sequence into small segments and estimate the recombination rate per segment via the regression model. In order to obtain changepoints in recombination we apply a frequentist segmentation method. This approach controls the type I error and provides confidence bands for the estimator. Overall, LDJump identifies hotspots with high accuracy under different levels of genetic diversity as well as demography, and is computationally fast even for genomic regions spanning many megabases. We will present a practical application of LDJump on a region of human chromosome 21 and compare our estimated population recombination rates with experimentally measured recombination events. (Joint work with Andreas Futschik, Irene Tiemann-Boege, and Angelika Heissl.)
Professor Dr. Ingo Scholtes, Data Analytics Group, University of Zürich (15:00-16:00)
Optimal Higher-Order Network Analytics for Time Series Data
Network-based data analysis techniques such as graph mining, social network analysis, link prediction and clustering are an important foundation for data science applications in computer science, computational social science, economics and bioinformatics. They help us to detect patterns in large corpora of data that capture relations between genes, brain regions, species, humans, documents, or financial institutions. While this potential of the network perspective is undisputed, advances in data sensing and collection increasingly provide us with high-dimensional, temporal, and noisy data on real systems. The complex characteristics of such data sources pose fundamental challenges for network analytics. They question the validity of network abstractions of complex systems and pose a threat for interdisciplinary applications of data analytics and machine learning. To address these challenges, I introduce a graphical modelling framework that accounts for the complex characteristics of real-world data on complex systems. I demonstrate this approach on time series data on technical, biological, and social systems. Current methods to analyze the topology of such systems discard information on the timing and ordering of interactions, which, however, determines which elements of a system can influence each other via paths. To solve this issue, I introduce a modelling framework that (i) generalises standard network representations towards multi-order graphical models for causal paths, and (ii) uses statistical learning to achieve an optimal balance between explanatory power and model complexity. The framework advances the theoretical foundation of data science and sheds light on the important question of when network representations of time series data are justified. It is the basis for a new generation of data analytics and machine learning techniques that account for both temporal and topological characteristics in real-world data.

Fri 15 Feb, '19 
Algorithms Seminar, MB0.08

Tue 19 Feb, '19 
YRM, Statistics Common Room

Wed 20 Feb, '19 
Taught SSLC, tbc