Principal Supervisor: Dr Jiarui Zhou
Secondary Supervisor(s): Dr Shan He
University of Registration: University of Birmingham
BBSRC Research Themes:
Chemical pollution degrades the environment and harms the health of animals and humans. Traditionally, toxicity assessment has relied on animal testing, which is crude and expensive. High-throughput omics technologies, such as transcriptomics and metabolomics, allow rapid and scalable analysis of biological samples. The resulting biomolecular profiles (multi-omics) provide comprehensive measurements of chemical impacts. They are information-rich and well suited to integration into in-silico models, which can predict toxicity while reducing reliance on animal testing. However, the inherent complexity of multi-omics data poses substantial challenges for computational modelling. Multi-omics datasets typically have few samples and many features (e.g. genes and metabolites), and data acquisition and processing methods are not standardised across studies. Conventional approaches build in-silico models with simple algorithms on data from a single study, which often results in poor performance and a high risk of overfitting.
A digital twin is a virtual replica that integrates diverse, hierarchically structured biomolecular data to simulate complex biological systems, allowing more accurate, simultaneous prediction of multiple toxicity endpoints. It serves as a unified computational framework that harmonises data across projects into a standardised digital representation and accumulates knowledge from new data to refine its internal models, improving their accuracy and predictive capability over time. It is a revolutionary approach to ecotoxicology research. The aim of this project is to use novel artificial intelligence methods, particularly graph neural networks, to model multi-omics data systematically and so create a digital twin that accurately predicts environmental chemical toxicity, reducing the reliance on animal testing. This project will contribute to the advancement of AI and digital twin technologies for ecotoxicology, adhering to the 3Rs principles (Replacement, Reduction, Refinement) for a safer future. It aligns with BBSRC’s priority area of Sustainable Agriculture and Food: Animal Health and Welfare.
Specifically, the PhD student will focus on three research objectives:
Objective 1. Develop data augmentation and feature misalignment alleviation algorithms to optimise AI model training on multiple batches of small datasets.
Objective 2. Develop graph neural networks for multi-omics data modelling to predict chemical toxicity endpoints.
Objective 3. Apply the algorithms to ecotoxicological multi-omics data to create a digital twin model.
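As a rough illustration of the kind of model Objective 2 refers to, the sketch below implements a single graph-convolution step in plain Python. It is a toy example under stated assumptions, not the project's method: nodes are taken to be samples, edges an assumed similarity graph between samples, and all feature values, weights and function names are hypothetical.

```python
import math

def normalise_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix A."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # every degree is >= 1 after self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj_norm, features, weights):
    """One propagation step: ReLU(A_norm @ X @ W)."""
    z = matmul(matmul(adj_norm, features), weights)
    return [[max(v, 0.0) for v in row] for row in z]

# Toy inputs (all values are made up for illustration).
# Four samples connected in a chain: 0 - 1 - 2 - 3.
adj = [[0.0, 1.0, 0.0, 0.0],
       [1.0, 0.0, 1.0, 0.0],
       [0.0, 1.0, 0.0, 1.0],
       [0.0, 0.0, 1.0, 0.0]]
# Three omics features per sample (e.g. two genes and one metabolite).
features = [[0.5, -1.2, 0.3],
            [1.1, 0.4, -0.7],
            [-0.2, 0.9, 1.5],
            [0.8, -0.3, 0.0]]
# Weights mapping 3 input features to 2 hidden units; learned in practice.
weights = [[0.2, -0.5],
           [0.7, 0.1],
           [-0.3, 0.6]]

adj_norm = normalise_adjacency(adj)
hidden = gcn_layer(adj_norm, features, weights)
print(len(hidden), len(hidden[0]))  # number of samples, number of hidden units
```

Each sample's hidden representation mixes its own features with those of its neighbours in the graph, which is the basic mechanism that lets such models share information across related samples. A real model would stack several such layers, learn the weights from data, and add an output layer for the toxicity endpoints.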
This PhD project synergises with the EU H2020 project PrecisionTox, which is co-led by the primary supervisor at the University of Birmingham. The PhD student will focus on multi-omics data from Daphnia magna, a non-sentient organism widely used in ecotoxicology research, with the scope potentially broadening to include the other four model organisms used in the PrecisionTox project. Development of the digital twin will build on the pilot computational studies in PrecisionTox.
The PhD student will work at the interface between bioscience (primary supervisor) and computer science (secondary supervisor), acquiring a unique multidisciplinary profile with expertise in AI, systems biology, and environmental science. The student will also be embedded in two large pan-European initiatives, the Horizon 2020 consortium PrecisionTox and the Horizon Europe PARC. These initiatives provide an extensive multidisciplinary network of experts in diverse fields.
- Björnsson, Bergthor, et al. "Digital twins to personalize medicine." Genome Medicine 12 (2020): 1-4.
- Wu, Zonghan, et al. "A comprehensive survey on graph neural networks." IEEE Transactions on Neural Networks and Learning Systems 32.1 (2020): 4-24.
- Li, Xiao, et al. "MoGCN: a multi-omics integration method based on graph convolutional network for cancer subtype analysis." Frontiers in Genetics 13 (2022): 806842.
- RNA-seq data processing
- LC-MS data processing
- Multi-omics integration modelling
- Multi-modal learning
- Graph embedding
- Generative adversarial network
- Graph neural networks