Thu 17 Jan, '19
-
CRiSM Seminar
MSB2.23

Prof. Galin Jones, School of Statistics, University of Minnesota (14:00-15:00)

Bayesian Spatiotemporal Modeling Using Hierarchical Spatial Priors, with Applications to Functional Magnetic Resonance Imaging

We propose a spatiotemporal Bayesian variable selection model for detecting activation in functional magnetic resonance imaging (fMRI) settings. Following recent research in this area, we use binary indicator variables for classifying active voxels. We assume that the spatial dependence in the images can be accommodated by applying an areal model to parcels of voxels. The use of parcellation and a spatial hierarchical prior (instead of the popular Ising prior) results in a posterior distribution amenable to exploration with an efficient Markov chain Monte Carlo (MCMC) algorithm. We study the properties of our approach by applying it to simulated data and an fMRI data set.

Dr. Flávio Gonçalves, Universidade Federal de Minas Gerais, Brazil (15:00-16:00)

Exact Bayesian inference in spatiotemporal Cox processes driven by multivariate Gaussian processes

In this talk we present a novel methodology to perform exact Bayesian inference for spatiotemporal Cox processes where the intensity function depends on a multivariate Gaussian process. Dynamic Gaussian processes are introduced to allow for evolution of the intensity function over discrete time. The novelty of the method lies in the fact that no discretisation error is involved, despite the non-tractability of the likelihood function and the infinite dimensionality of the problem. The method is based on a Markov chain Monte Carlo algorithm that samples from the joint posterior distribution of the parameters and latent variables of the model. The models are defined in a general and flexible way but they are amenable to direct sampling from the relevant distributions, due to careful characterisation of their components. The models also allow for the inclusion of regression covariates and/or temporal components to explain the variability of the intensity function. These components may be subject to relevant interaction with space and/or time. Real and simulated examples illustrate the methodology, followed by concluding remarks.

Thu 31 Jan, '19
-
CRiSM Seminar
MSB2.23

Professor Paul Fearnhead, Lancaster University (14:00-15:00)

Efficient Approaches to Changepoint Problems with Dependence Across Segments

Changepoint detection is an increasingly important problem across a range of applications. It is most commonly encountered when analysing time-series data, where changepoints correspond to points in time where some feature of the data, for example its mean, changes abruptly. Often there are important computational constraints when analysing such data, with the number of data sequences and their lengths meaning that only very efficient methods for detecting changepoints are practically feasible.

A natural way of estimating the number and location of changepoints is to minimise a cost that trades off a measure of fit to the data against the number of changepoints fitted. There are now efficient algorithms that can exactly solve the resulting optimisation problem, but they are only applicable in situations where there is no dependence of the mean of the data across segments. Using such methods can lead to a loss of statistical efficiency in situations where, for example, it is known that the change in mean must be positive.
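The penalised-cost formulation above can be made concrete with a small sketch. Assuming a squared-error segment cost and a fixed penalty beta per changepoint, this is an illustrative "optimal partitioning" recursion (not the speaker's constrained algorithm, and the function name is mine):

```python
import numpy as np

def optimal_partitioning(y, beta):
    """Exactly minimise total squared-error segment cost plus a penalty
    beta per changepoint, via F(t) = min_{s<t} F(s) + C(y[s:t]) + beta."""
    y = np.asarray(y, float)
    n = len(y)
    S = np.concatenate([[0.0], np.cumsum(y)])        # prefix sums
    S2 = np.concatenate([[0.0], np.cumsum(y ** 2)])  # prefix sums of squares

    def cost(s, t):  # squared error of y[s:t] around its mean, in O(1)
        m = t - s
        return S2[t] - S2[s] - (S[t] - S[s]) ** 2 / m

    F = np.full(n + 1, np.inf)
    F[0] = -beta  # so the penalty counts changepoints, not segments
    last = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        cands = [F[s] + cost(s, t) + beta for s in range(t)]
        last[t] = int(np.argmin(cands))
        F[t] = cands[last[t]]
    cps, t = [], n  # backtrack the optimal changepoint positions
    while t > 0:
        t = last[t]
        if t > 0:
            cps.append(t)
    return sorted(cps)
```

For a mean shift from 0 to 5 at position 20 in 40 points, `optimal_partitioning([0.0] * 20 + [5.0] * 20, 5.0)` recovers the single changepoint at 20; the constrained algorithms in the talk restrict the minimisation to changes satisfying, e.g., a positivity constraint.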

This talk will present a new class of efficient algorithms that can exactly minimise our cost whilst imposing certain constraints on the relationship of the mean before and after a change. These algorithms have links to recursions that are seen for discrete-state hidden Markov Models, and within sequential Monte Carlo. We demonstrate the usefulness of these algorithms on problems such as detecting spikes in calcium imaging data. Our algorithm can analyse data of length 100,000 in less than a second, and has been used by the Allen Brain Institute to analyse the spike patterns of over 60,000 neurons.

(This is joint work with Toby Hocking, Sean Jewell, Guillem Rigaill and Daniela Witten.)

Dr. Sandipan Roy, Department of Mathematical Sciences, University of Bath (15:00-16:00)

Network Heterogeneity and Strength of Connections

Abstract: Detecting the strength of connection in a network is a fundamental problem in understanding the relationships among individuals. Often it is more important to understand how strongly two individuals are connected than the mere presence/absence of the edge. This paper introduces a new concept of strength of connection in a network through a nonparametric object called the “Grafield”. The “Grafield” is a piecewise-constant bivariate kernel function that compactly represents the affinity or strength of ties (or interactions) between every pair of vertices in the graph. We estimate the “Grafield” function through a spectral analysis of the Laplacian matrix followed by hard thresholding (Gavish & Donoho, 2014) of the singular values. Our estimation methodology is also valid for asymmetric directed networks. As a by-product we obtain an efficient procedure for edge probability matrix estimation as well. We validate our proposed approach with several synthetic experiments and compare it with existing algorithms for edge probability matrix estimation. We also apply our proposed approach to three real datasets: understanding the strength of connection in (a) a social messaging network, (b) a network of political parties in the US Senate, and (c) a neural network of neurons and synapses in C. elegans, a type of worm.
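The hard-thresholding step behind the edge-probability estimate can be sketched generically. This is a plain SVD version using the Gavish & Donoho square-matrix rule of thumb (threshold roughly 2.858 times the median singular value when the noise level is unknown), not the authors' exact Laplacian-based procedure:

```python
import numpy as np

def edge_prob_estimate(A):
    """Denoise an adjacency matrix by hard-thresholding its singular
    values (Gavish & Donoho, 2014), then clip to [0, 1] to obtain an
    edge-probability matrix estimate."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    tau = 2.858 * np.median(s)      # threshold for unknown noise level
    s = np.where(s >= tau, s, 0.0)  # kill singular values below tau
    P = (U * s) @ Vt                # low-rank reconstruction U diag(s) Vt
    return np.clip(P, 0.0, 1.0)     # probabilities live in [0, 1]
```

On an Erdős–Rényi graph with constant edge probability, the retained leading singular component reconstructs that probability to good accuracy.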

Thu 14 Feb, '19
-
CRiSM Seminar
MSB2.23

Philipp Hermann, Institute of Applied Statistics, Johannes Kepler University Linz, Austria

Time: 14:00-15:00

LDJump: Estimating Variable Recombination Rates from Population Genetic Data

Recombination is a process during meiosis which starts with the formation of DNA double-strand breaks and results in an exchange of genetic material between homologous chromosomes. In many species, recombination is concentrated in narrow regions known as hotspots, flanked by large zones with low recombination. As recombination plays an important role in evolution, its estimation and the identification of hotspot positions are of considerable interest. In this talk we introduce LDJump, our method to estimate local population recombination rates with relevant summary statistics as explanatory variables in a regression model. More precisely, we divide the DNA sequence into small segments and estimate the recombination rate per segment via the regression model. In order to obtain change-points in recombination we apply a frequentist segmentation method. This approach controls the type I error and provides confidence bands for the estimator. Overall, LDJump identifies hotspots with high accuracy under different levels of genetic diversity and demography, and is computationally fast even for genomic regions spanning many megabases. We will present a practical application of LDJump to a region of human chromosome 21 and compare our estimated population recombination rates with experimentally measured recombination events.

(joint work with Andreas Futschik, Irene Tiemann-Boege, and Angelika Heissl)

Professor Dr. Ingo Scholtes, Data Analytics Group, University of Zürich

Time: 15:00-16:00

Optimal Higher-Order Network Analytics for Time Series Data

Network-based data analysis techniques such as graph mining, social network analysis, link prediction and clustering are an important foundation for data science applications in computer science, computational social science, economics and bioinformatics. They help us to detect patterns in large corpora of data that capture relations between genes, brain regions, species, humans, documents, or financial institutions. While this potential of the network perspective is undisputed, advances in data sensing and collection increasingly provide us with high-dimensional, temporal, and noisy data on real systems. The complex characteristics of such data sources pose fundamental challenges for network analytics. They question the validity of network abstractions of complex systems and pose a threat to interdisciplinary applications of data analytics and machine learning.

To address these challenges, I introduce a graphical modelling framework that accounts for the complex characteristics of real-world data on complex systems. I demonstrate this approach using time series data on technical, biological, and social systems. Current methods to analyse the topology of such systems discard information on the timing and ordering of interactions, which however determines which elements of a system can influence each other via paths. To solve this issue, I introduce a modelling framework that (i) generalises standard network representations towards multi-order graphical models for causal paths, and (ii) uses statistical learning to achieve an optimal balance between explanatory power and model complexity. The framework advances the theoretical foundation of data science and sheds light on the important question of when network representations of time series data are justified. It is the basis for a new generation of data analytics and machine learning techniques that account for both temporal and topological characteristics in real-world data.
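The balance between explanatory power and model complexity can be illustrated with a toy order-selection example: fit first- and second-order Markov models to a collection of paths and compare them by AIC. The parameter count below is a deliberate simplification (free probabilities restricted to the observed support), purely for illustration, and the function names are mine:

```python
from collections import Counter
from math import log

def fit_loglik(paths, order):
    """Maximum-likelihood fit of an order-k Markov model on paths;
    returns the log-likelihood and a crude free-parameter count."""
    trans, ctx = Counter(), Counter()
    for p in paths:
        for i in range(order, len(p)):
            key = tuple(p[i - order:i])   # the k preceding states
            trans[(key, p[i])] += 1
            ctx[key] += 1
    ll = sum(c * log(c / ctx[key]) for (key, _), c in trans.items())
    n_params = len(trans) - len(ctx)  # one sum-to-one constraint per context
    return ll, n_params

def markov_order_aic(paths):
    """AIC of first- and second-order models; lower is preferred."""
    aic = {}
    for order in (1, 2):
        ll, n_params = fit_loglik(paths, order)
        aic[order] = 2 * n_params - 2 * ll
    return aic
```

On paths with strong second-order structure (e.g. ten copies each of a→b→c and c→b→a), the second-order model wins despite its larger state space, because the first-order model cannot tell where a walker arriving at b came from.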

Thu 28 Feb, '19
-
CRiSM Seminar
MSB2.23

Prof. Valerie Isham, Department of Statistical Science, University College London, UK (15:00-16:00)

Stochastic Epidemic Models: Approximations, structured populations and networks

Abstract: Epidemic models are developed as a means of gaining understanding about the dynamics of the spread of infection (human and animal pathogens, computer viruses etc.) and of rumours and other information. This understanding can then inform control measures to limit, or in some cases enhance, spread. Towards this goal, I will start from some simple stochastic transmission models, and describe some Gaussian approximations and their use for inference, illustrating this with data from a norovirus outbreak as well as from simulations. I will then discuss ways of incorporating population structure via metapopulations and networks, and the effects of network structure on epidemic spread. Finally I will briefly consider the extension to explicitly spatial mobile networks, as for example when computer viruses spread via short-range wireless or bluetooth connections.
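The "simple stochastic transmission models" the talk starts from can be simulated exactly. A minimal Gillespie simulation of the Markovian SIR model (a generic sketch; the function and parameter names are illustrative, not from the talk):

```python
import numpy as np

def gillespie_sir(S, I, R, beta, gamma, rng):
    """Exact stochastic simulation of the Markovian SIR model:
    infection events at rate beta*S*I/N, recoveries at rate gamma*I.
    Returns the full path of (time, S, I, R) states."""
    N = S + I + R
    t, path = 0.0, [(0.0, S, I, R)]
    while I > 0:  # the epidemic ends when no infectives remain
        rate_inf = beta * S * I / N
        rate_rec = gamma * I
        total = rate_inf + rate_rec
        t += rng.exponential(1.0 / total)   # time to the next event
        if rng.uniform() * total < rate_inf:
            S, I = S - 1, I + 1             # infection
        else:
            I, R = I - 1, R + 1             # recovery
        path.append((t, S, I, R))
    return path
```

Gaussian approximations of the kind discussed in the talk replace this jump process by a diffusion around the deterministic SIR curve, which is accurate when the population is large.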

Thu 14 Mar, '19
-
CRiSM Seminar
A1.01

Speaker: Spencer Wheatley, ETH Zurich, Switzerland

Title: The "endo-exo" problem in financial market price fluctuations, & the ARMA point process

The "endo-exo" problem -- i.e., decomposing system activity into exogenous and endogenous parts -- lies at the heart of statistical identification in many fields of science. E.g., consider the problem of determining if an earthquake is a mainshock or aftershock, or if a surge in the popularity of a youtube video is because it is "going viral", or simply due to high activity across the platform. Solution of this problem is often plagued by spurious inference (namely false strong interaction) due to neglect of trends, shocks and shifts in the data. The predominant point process model for endo-exo analysis in the field of quantitative finance is the Hawkes process. A comparison of this field with the relatively mature fields of econometrics and time series identifies the need to more rigorously control for trends and shocks. Doing so allows us to test the hypothesis that the market is "critical" -- analogous to a unit root test commonly done in economic time series -- and challenge earlier results. Continuing "lessons learned" from the time series field, it is argued that the Hawkes point process is analogous to integer valued AR time series. Following this analogy, we introduce the ARMA point process, which flexibly combines exo background activity (Poisson), shot-noise bursty dynamics, and self-exciting (Hawkes) endogenous activity. We illustrate a connection to ARMA time series models, as well as derive an MCEM (Monte Carlo Expectation Maximization) algorithm to enable MLE of this process, and assess consistency by simulation study. Remaining challenges in estimation and model selection as well as possible solutions are discussed.

 

[1] Wheatley, S., Wehrli, A., and Sornette, D. "The endo-exo problem in high frequency financial price fluctuations and rejecting criticality". To appear in Quantitative Finance (2018). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3239443

[2] Wheatley, S., Schatz, M., and Sornette, D. "The ARMA Point Process and its Estimation." arXiv preprint arXiv:1806.09948 (2018).

Wed 20 Mar, '19
-
CRiSM Day
MS.01
Wed 27 Mar, '19
-
CRiSM Seminar
MSB2.23

Daniel Rudolf, Institute for Mathematical Stochastics, Georg-August-Universität Göttingen

Title: Quantitative spectral gap estimate and Wasserstein contraction of simple slice sampling

Abstract: By proving Wasserstein contraction of simple slice sampling for approximate sampling from distributions determined by log-concave, rotationally invariant unnormalized densities, we derive an explicit quantitative lower bound on the spectral gap. In particular, the lower bound on the spectral gap carries over to more general distributions depending only on the volume of the (super-)level sets of the unnormalized density.
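The "simple" slice sampler analysed here is the idealised scheme that alternates a uniform height under the density with a uniform draw from the resulting level set. For a standard Gaussian target the level sets are intervals, so the scheme can be written in full (a sketch for intuition, not the paper's construction):

```python
import numpy as np

def simple_slice_sampler(n_steps, x0, rng):
    """Simple slice sampler for the unnormalized density
    f(x) = exp(-x^2 / 2): draw a height t ~ U(0, f(x)), then draw x'
    uniformly from the level set {x : f(x) > t}, which here is the
    interval (-r, r) with r = sqrt(-2 log t)."""
    x, chain = x0, []
    for _ in range(n_steps):
        t = rng.uniform(0.0, np.exp(-x * x / 2.0))  # uniform height
        r = np.sqrt(-2.0 * np.log(t))               # level-set radius
        x = rng.uniform(-r, r)                      # uniform on the slice
        chain.append(x)
    return np.array(chain)
```

The spectral-gap bound in the abstract quantifies how quickly such a chain forgets its starting point; the rotational-invariance assumption is what makes the level sets balls in higher dimensions.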

Thu 2 May, '19
-
CRiSM Seminar
A1.01

Speaker: Dr. Ben Calderhead, Department of Mathematics, Imperial College London
Title: Quasi Markov Chain Monte Carlo Methods

Abstract: Quasi-Monte Carlo (QMC) methods for estimating integrals are attractive since the resulting estimators typically converge at a faster rate than pseudo-random Monte Carlo. However, they can be difficult to set up on arbitrary posterior densities within the Bayesian framework, in particular for inverse problems. We introduce a general parallel Markov chain Monte Carlo (MCMC) framework, for which we prove a law of large numbers and a central limit theorem. In that context, non-reversible transitions are investigated. We then extend this approach to the use of adaptive kernels and state conditions under which ergodicity holds. As a further extension, an importance sampling estimator is derived, for which asymptotic unbiasedness is proven. We consider the use of completely uniformly distributed (CUD) numbers within the above-mentioned algorithms, which leads to a general parallel quasi-MCMC (QMCMC) methodology. We prove consistency of the resulting estimators and demonstrate numerically that this approach scales close to n^{-2} as we increase parallelisation, instead of the usual n^{-1} that is typical of standard MCMC algorithms. In practical statistical models we observe multiple orders of magnitude improvement compared with pseudo-random methods.

Mon 13 May, '19
-
CRiSM Seminar
MB0.07

Prof. Renaud Lambiotte, University of Oxford, UK (15:00-16:00)

Higher-Order Networks

Network science provides powerful analytical and computational methods to describe the behaviour of complex systems. From a networks viewpoint, the system is seen as a collection of elements interacting through pairwise connections. Canonical examples include social networks, neuronal networks or the Web. Importantly, elements often interact directly with a relatively small number of other elements, while they may influence large parts of the system indirectly via chains of direct interactions. In other words, networks allow for a sparse architecture together with global connectivity. Compared with mean-field approaches, network models often have greater explanatory power because they account for the non-random topologies of real-life systems. However, new forms of high-dimensional and time-resolved data have now also shed light on the limitations of these models. In this talk, I will review recent advances in the development of higher-order network models, which account for different types of higher-order dependencies in complex data. Those include temporal networks, where the network is itself a dynamical entity and higher-order Markov models, where chains of interactions are more than a combination of links.

Thu 30 May, '19
-
CRiSM Seminar
A1.01

Dr. Yoav Zemel, University of Göttingen, Germany (15:00-16:00)

Title: Procrustes Metrics on Covariance Operators and Optimal Transportation of Gaussian Processes

Abstract: Covariance operators are fundamental in functional data analysis, providing the canonical means to analyse functional variation via the celebrated Karhunen-Loève expansion. These operators may themselves be subject to variation, for instance in contexts where multiple functional populations are to be compared. Statistical techniques to analyse such variation are intimately linked with the choice of metric on covariance operators, and the intrinsic infinite-dimensionality of these operators. We describe the manifold-like geometry of the space of trace-class infinite-dimensional covariance operators and associated key statistical properties, under the recently proposed infinite-dimensional version of the Procrustes metric (Pigoli et al. Biometrika 101, 409–422, 2014). We identify this space with that of centred Gaussian processes equipped with the Wasserstein metric of optimal transportation. The identification allows us to provide a detailed description of those aspects of this manifold-like geometry that are important in terms of statistical inference; to establish key properties of the Fréchet mean of a random sample of covariances; and to define generative models that are canonical for such metrics and link with the problem of registration of warped functional data.
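In finite dimensions this identification gives the distance a closed form, the Bures/Procrustes formula d(A, B)^2 = tr A + tr B - 2 tr (A^{1/2} B A^{1/2})^{1/2}. A numpy-only sketch (the helper names are mine, not the paper's):

```python
import numpy as np

def psd_sqrt(M):
    """Symmetric square root of a positive semi-definite matrix,
    via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def procrustes_wasserstein(A, B):
    """2-Wasserstein distance between the centred Gaussians N(0, A)
    and N(0, B): d^2 = tr A + tr B - 2 tr (A^{1/2} B A^{1/2})^{1/2}."""
    rA = psd_sqrt(A)
    cross = psd_sqrt(rA @ B @ rA)   # symmetric since rA and B are
    d2 = np.trace(A) + np.trace(B) - 2.0 * np.trace(cross)
    return np.sqrt(max(d2, 0.0))    # guard tiny negative round-off
```

For commuting covariances the distance reduces to the Euclidean distance between the square roots of the eigenvalues, e.g. d(diag(1, 4), diag(4, 1)) = sqrt(2); the infinite-dimensional, trace-class setting of the talk is the operator analogue of this formula.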

Thu 13 Jun, '19
-
CRiSM Seminar
MSB2.22

Prof. Karla Hemming, University of Birmingham, UK (15:00-16:00)

Speaker: Clair Barnes, University College London, UK

Death & the Spider: postprocessing multi-ensemble weather forecasts with uncertainty quantification

Ensemble weather forecasts often under-represent uncertainty, leading to overconfidence in their predictions. Multi-model forecasts combining several individual ensembles have been shown to display greater skill than single-ensemble forecasts in predicting temperatures, but tend to retain some bias in their joint predictions. Established postprocessing techniques are able to correct bias and calibration issues in univariate forecasts, but are generally not designed to handle multivariate forecasts (of several variables or at several locations, say) without separate specification of the structure of the inter-variable dependence.

We propose a flexible multivariate Bayesian postprocessing framework, developed around a directed acyclic graph representing the relationships between the ensembles and the observed weather. The posterior forecast is inferred from the ensemble forecasts and an estimate of their shared discrepancy, which is obtained from a collection of past forecast-observation pairs. The approach is illustrated with an application to forecasts of UK surface temperatures during the winter period from 2007-2013.

Speaker: Karla Hemming, University of Birmingham (15:00-16:00)

The I-squared-CRT statistic to describe treatment effect heterogeneity in cluster randomized trials.

K Hemming (Birmingham) and A Forbes (Monash)

Treatment effect heterogeneity is commonly investigated in meta-analyses of treatment effects across different studies. The effect of a treatment might also vary across clusters in a cluster randomized trial, and it can be of interest to explore this at the analysis stage. In stepped-wedge designs and other cluster randomized designs in which clusters are exposed to both treatment and control conditions, this treatment effect heterogeneity can be identified. When conducting a meta-analysis it is common to describe the magnitude of any treatment effect heterogeneity using the I-squared statistic, which is an intuitive and easily understood concept. Here we derive and evaluate a comparable measure of the magnitude of heterogeneity in treatment effects across clusters in cluster randomized trials, the I-squared-CRT.
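For reference, the classical meta-analytic I-squared that the I-squared-CRT parallels is computed from effect estimates and their variances via Cochran's Q. A textbook sketch (not the authors' derivation for clustered designs):

```python
import numpy as np

def i_squared(effects, variances):
    """Classical meta-analytic heterogeneity statistic:
    Q = sum w_k (theta_k - theta_bar)^2 with inverse-variance weights
    w_k = 1/var_k, and I^2 = max(0, (Q - df)/Q), the share of the
    variability in effect estimates attributable to heterogeneity
    rather than chance."""
    w = 1.0 / np.asarray(variances, float)
    theta = np.asarray(effects, float)
    theta_bar = np.sum(w * theta) / np.sum(w)  # fixed-effect pooled mean
    Q = np.sum(w * (theta - theta_bar) ** 2)
    df = len(theta) - 1
    return max(0.0, (Q - df) / Q) if Q > 0 else 0.0
```

Replacing studies by clusters of a trial gives the flavour of the proposed statistic, though the talk's derivation accounts for the within-trial design.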

Tue 25 Jun, '19
-
CRiSM Seminar
MS.05

Prof. Malgorzata Bogdan, University of Wroclaw, Poland (15:00-16:00)

Abstract: The Sorted L-One Penalized Estimator (SLOPE) is a relatively new convex optimization procedure for identifying predictors in large databases. In this lecture we will present the method, some theoretical and empirical results illustrating its properties, and applications in the context of genomic and medical data. Apart from the classical version of SLOPE we will also discuss its spike-and-slab version, aimed at reducing the bias of estimators of regression coefficients. When discussing SLOPE we will also present some new theoretical results on the probability of discovering the true model by LASSO (which is a specific instance of SLOPE) and its thresholded version.
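SLOPE penalises the sorted absolute coefficients with a nonincreasing weight sequence; with all weights equal it reduces to the LASSO. The key step of proximal-gradient solvers is the prox of the sorted-l1 norm, computable by pool-adjacent-violators. A sketch in the spirit of the stack algorithm of Bogdan et al. (2015), not a reference implementation:

```python
import numpy as np

def prox_sorted_l1(v, lam):
    """Prox of the sorted-l1 norm for v = |coefficients| sorted in
    decreasing order, with nonincreasing weights lam: pool adjacent
    violators on z = v - lam, then clip at zero."""
    blocks = []  # each block: [start, end, sum, count]
    for i in range(len(v)):
        blocks.append([i, i, v[i] - lam[i], 1])
        # merge while block means fail to be strictly decreasing
        while len(blocks) > 1 and \
                blocks[-2][2] / blocks[-2][3] <= blocks[-1][2] / blocks[-1][3]:
            b = blocks.pop()
            blocks[-1][1] = b[1]
            blocks[-1][2] += b[2]
            blocks[-1][3] += b[3]
    x = np.zeros(len(v))
    for s, e, tot, cnt in blocks:
        x[s:e + 1] = max(tot / cnt, 0.0)
    return x

def prox_slope(y, lam):
    """Full SLOPE prox: sort |y| decreasingly, apply prox_sorted_l1,
    then restore the original order and signs."""
    y = np.asarray(y, float)
    order = np.argsort(-np.abs(y))
    x_sorted = prox_sorted_l1(np.abs(y)[order], lam)
    x = np.empty_like(y)
    x[order] = x_sorted
    return np.sign(y) * x
```

With equal weights the prox is exactly LASSO soft-thresholding; with decreasing weights it can average tied coefficients, which is how SLOPE adapts the penalty to the rank of each coefficient.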

Fri 28 Jun, '19
-
CRiSM Seminar
MB2.23

Dr. Pauline O'Shaughnessy, University of Wollongong, Australia

Title: Bootstrap inference for longitudinal data with multiple sources of variation

Abstract: Linear mixed models allow us to model the dependence among responses by incorporating random effects. Such dependence, inherent in longitudinal data from a complex design, can arise from clustering between subjects and from repeated measurements within subjects. When the underlying distribution is not fully specified, we consider a class of estimators defined by the Gaussian quasi-likelihood for normal-like response variables. Historically, it has been challenging to make inferences about the variance components in the framework of mixed models. We propose a new weighted estimating equation bootstrap, which varies weight schemes for different parameter estimators. The performance of the weighted estimating equation bootstrap is empirically evaluated in simulation studies, showing improved coverage and variance estimation for the variance component estimators under models with normal and non-normal distributions for random effects. The asymptotic properties will also be addressed, and we apply this new bootstrap method to a longitudinal dataset in biology.

(This is a joint work with Professor Alan Welsh from the Australian National University.)

Thu 24 Oct, '19
-
CRiSM Seminar
MB0.07

Localizing Changes in High-Dimensional Vector Autoregressive Processes

Thu 7 Nov, '19
-
CRiSM Seminar: High-dimensional principal component analysis with heterogeneous missingness
MB0.07
Thu 21 Nov, '19
-
CRiSM Seminar - Modelling Networks and Network Populations via Graph Distances
MB0.07 Mathematical Sciences Building

Speaker: Sofia Olhede

Thu 5 Dec, '19
-
CRiSM Seminar
MB0.07
Wed 15 Jan, '20
-
CRiSM Seminar - Deep learning in genomics, and a topic model for single cell analysis - Gerton Lunter
MB0.07
Wed 29 Jan, '20
-
CRiSM Seminar - Modelling spatially correlated binary data, Professor Jianxin Pan
MB0.07
Wed 12 Feb, '20
-
CRiSM Seminar - Model Property-Based and Structure-Preserving ABC for complex stochastic models
MB0.07
Wed 26 Feb, '20
-
CRiSM Seminar - Sequential learning via a combined reinforcement learning and data assimilation ansatz for decision support
MB0.07
Wed 4 Mar, '20
CRiSM Seminar - Scaling Optimal Transport for High dimensional Learning
MB0.07 Mathematical Sciences Building

Speaker: Gabriel Peyré, CNRS and Ecole Normale Supérieure

Thu 30 Apr, '20
-
CRiSM Seminar - Simon French
Online
Thu 14 May, '20
-
CRiSM Seminar - Jane Hutton: I know I don't know: Covid-19 patients' journeys through hospital
Online

I was asked to consider the available data on Covid-19 patients' path into hospital, and then to intensive care, or death, or transfer or discharge. Of course, once in intensive care, patients can move to the states death, discharge home, discharge to nursing home, discharge to hospital ward. I was invited by those who think I know about analysis of times to events with messy data. The data is messy, and there are other challenges.

I benefited from conversations with medical friends and colleagues, particularly a respiratory physician.

Depending on permissions, I will either illustrate issues with artificial data, or present actual results.

Thu 11 Jun, '20
-
CRiSM Seminar - Olivier Renaud
Online
Thu 25 Jun, '20
-
CRiSM Seminar
MB0.08
Wed 28 Oct, '20
-
CRiSM Seminar
via Teams
Thu 12 Nov, '20
-
CRiSM Seminar
via Teams
Thu 26 Nov, '20
-
CRiSM Seminar
via Teams
Thu 10 Dec, '20
-
CRiSM Seminar
via Teams