Methodological Developments
Underpinning our medical and biological work is the development of new methodologies in statistics, mathematics, and computer science. The examples below illustrate the type of high-impact work we undertake in developing new methodologies.
Data Analysis
One of the most fundamental challenges is to extract meaning from complex biological data. Work in the Zeeman Institute has pioneered multiple areas, including:
Quantitative image analysis, which has led to the development of three open-source software packages: QuimP, for correlating cortical cell fluorescence with membrane movements; LineageTracker, a multi-feature cell tracker; and CellTracker, which is specifically dedicated to measuring periodic nuclear-cytoplasmic translocations of transcription factors.
Bioinformatics, in particular the application of Bayesian statistical machine learning techniques to problems in systems biology, functional genomics and proteomics. This work uses genome sequences and other forms of high-throughput data to understand fundamental biological processes, but the wealth of data generated requires the development of sophisticated mathematical and computational tools.
Biological and epidemiological data are often confounded by noise and biases in detection; bespoke statistical techniques are often required to extract the underlying signals.
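As a minimal illustration of correcting for a detection bias, the sketch below simulates under-reported case counts (each true case observed with some probability) and recovers an estimate of the true counts. All numbers, including the detection probability, are synthetic and purely illustrative, not drawn from Zeeman Institute data.

```python
import random

random.seed(0)

# Hypothetical setting: each true case is detected independently with
# probability p_detect (which might be estimated from a validation study).
p_detect = 0.4
true_counts = [50, 80, 120, 90, 60]   # synthetic weekly case counts

# Simulate the biased observation process: binomial thinning of true counts.
observed = [sum(random.random() < p_detect for _ in range(n)) for n in true_counts]

# A simple bias-corrected estimate of the true count is observed / p_detect.
corrected = [obs / p_detect for obs in observed]

print(observed)
print([round(c, 1) for c in corrected])
```

In practice, uncertainty in the detection probability itself would also need to be propagated, typically within a full statistical model.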
Machine Learning
Machine learning is a family of statistical methods that aim to extract an informative signal from complex (high-dimensional) and often noisy data. We have used these techniques in three main settings.
Early cancer detection is essentially a prediction task based on the available data. We are involved in a project to develop a universal blood test for cancer by integrating hundreds of different kinds of blood-based cancer biomarkers using machine learning algorithms.
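A minimal sketch of this kind of biomarker integration, using logistic regression fitted by gradient ascent on entirely synthetic data; the marker count, effect sizes, and simulated labels are assumptions for illustration only, and bear no relation to the actual biomarkers or algorithms in the project.

```python
import math
import random

random.seed(1)

n_markers = 4
true_w = [1.5, -0.8, 0.6, 0.0]   # synthetic effect sizes; last marker uninformative

def simulate(n):
    """Generate synthetic biomarker profiles and cancer/no-cancer labels."""
    X, y = [], []
    for _ in range(n):
        x = [random.gauss(0, 1) for _ in range(n_markers)]
        p = 1 / (1 + math.exp(-sum(w * xi for w, xi in zip(true_w, x))))
        X.append(x)
        y.append(1 if random.random() < p else 0)
    return X, y

X, y = simulate(500)

# Fit weights by gradient ascent on the logistic log-likelihood.
w = [0.0] * n_markers
lr = 0.1
for _ in range(200):
    grad = [0.0] * n_markers
    for x, yi in zip(X, y):
        p = 1 / (1 + math.exp(-sum(wj * xj for wj, xj in zip(w, x))))
        for j in range(n_markers):
            grad[j] += (yi - p) * x[j]
    w = [wj + lr * gj / len(X) for wj, gj in zip(w, grad)]

print([round(wj, 2) for wj in w])   # recovered weights approximate true_w
```

Real biomarker panels would involve hundreds of correlated features, so regularisation and careful validation become essential.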
Electronic noses are devices which can measure a profile of volatile organic compounds (VOCs), which are produced by the body and vary in response to disease, thus giving a distinctive 'smell' that characterises that disease. The data are highly complex, and we are working to develop better ways to extract the structure, leading to improved ability to correctly diagnose the presence or absence of a given disease.
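To illustrate extracting diagnostic structure from a VOC-style sensor profile, the sketch below classifies synthetic "e-nose" readings with a simple nearest-centroid rule. The sensor count, the choice of which sensors shift under disease, and the classifier itself are all assumptions for illustration, not the methods actually used.

```python
import random

random.seed(2)

n_sensors = 10
# Hypothetical assumption: disease shifts the mean response of 3 of 10 sensors.
shift = [1.0 if i < 3 else 0.0 for i in range(n_sensors)]

def profile(diseased):
    """Simulate one noisy VOC sensor profile."""
    return [random.gauss(shift[i] if diseased else 0.0, 1.0) for i in range(n_sensors)]

train = [(profile(d), d) for d in [0, 1] * 100]

def centroid(cls):
    members = [x for x, d in train if d == cls]
    return [sum(col) / len(members) for col in zip(*members)]

c0, c1 = centroid(0), centroid(1)

def classify(x):
    # Assign to the class whose centroid is nearest (squared distance).
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    return int(d1 < d0)

test_set = [(profile(d), d) for d in [0, 1] * 50]
accuracy = sum(classify(x) == d for x, d in test_set) / len(test_set)
print(round(accuracy, 2))
```

Real VOC data are far messier, with drift and cross-sensitivity between sensors, which is precisely why more sophisticated structure-extraction methods are needed.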
Data integration is a key task in modern medical research, due to the increasing ease with which multiple modalities can be measured. For a number of years we have been developing methods that provide a principled statistical framework for integrating highly heterogeneous data sources.
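As a toy illustration of why heterogeneous sources need care before combining, the sketch below standardises two synthetic modalities measured on very different scales. The values and modality names are invented; a principled framework of the kind described would use a joint statistical model rather than simple standardisation and concatenation.

```python
import statistics

# Synthetic measurements for the same five samples, on incompatible scales.
gene_expression = [5.1, 7.3, 6.2, 8.0, 5.5]          # arbitrary units
protein_level = [120.0, 340.0, 210.0, 400.0, 150.0]  # a very different scale

def zscore(xs):
    """Put one modality on a common scale (zero mean, unit variance)."""
    mu, sd = statistics.mean(xs), statistics.stdev(xs)
    return [(x - mu) / sd for x in xs]

# Standardise each modality, then pair the features per sample so that
# neither source dominates purely because of its units.
integrated = list(zip(zscore(gene_expression), zscore(protein_level)))
print([tuple(round(v, 2) for v in row) for row in integrated])
```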
Model Development
Matching Models and Data
It is often said that "models are only as good as the data that support them", a view echoed in all of the work at the Zeeman Institute. Much of our research involves the matching of complex models to complex data. In particular, we often use sophisticated Bayesian (MCMC) techniques to infer parameter values for our models. We are also interested in how parameter sensitivity translates into qualitative changes in model behaviour, especially in relation to policy predictions.
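The core of such Bayesian inference can be sketched with a minimal Metropolis-Hastings sampler. Here the "model" is deliberately trivial (Gaussian observations with one unknown parameter, a flat prior, and synthetic data); in practice the likelihood would come from an epidemic or cell model, but the accept/reject machinery is the same.

```python
import math
import random

random.seed(3)

# Synthetic data: 100 observations from a model with true parameter 2.0.
data = [random.gauss(2.0, 1.0) for _ in range(100)]

def log_likelihood(theta):
    # Gaussian log-likelihood up to an additive constant.
    return sum(-0.5 * (x - theta) ** 2 for x in data)

theta = 0.0                 # arbitrary starting value
ll = log_likelihood(theta)
samples = []
for step in range(5000):
    proposal = theta + random.gauss(0, 0.3)   # symmetric random-walk proposal
    ll_prop = log_likelihood(proposal)
    # Metropolis acceptance rule (flat prior, so only likelihoods appear).
    if math.log(random.random()) < ll_prop - ll:
        theta, ll = proposal, ll_prop
    if step >= 1000:        # discard burn-in before collecting samples
        samples.append(theta)

posterior_mean = sum(samples) / len(samples)
print(round(posterior_mean, 2))   # should sit close to the true value 2.0
```

The collected samples approximate the full posterior distribution, so quantities such as credible intervals, and hence parameter sensitivity, fall out directly rather than requiring a separate analysis.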