Exploring mechanisms of platelet function using large datasets employing machine learning and bioinformatics tools

Principal Supervisor: Dr Archana Sharma-Oates

Secondary Supervisor(s): Professor Neil Morgan / Professor Georgios V. Gkoutos 

University of Registration: University of Birmingham

BBSRC Research Themes:

No longer accepting applications


Project Outline

Genetic variation between individuals is what makes us unique. Alterations in the DNA code can lead to changes in the amino acid sequence, which ultimately affect the structure and function of the resulting proteins.

Genetic variation has been implicated in disrupting the function of platelets, the blood cells primarily responsible for clotting [1]. Evidence from the literature suggests a significant association between genetic heterogeneity within the G protein-coupled receptor (GPCR) genes and platelet function [2]. Genetic variation has also been linked with platelet function in primary hemostasis, and heterozygous mutations in numerous platelet proteins can cause a range of bleeding symptoms. However, the correlation between genetic variation within platelet proteins and clinical and biochemical parameters has not yet been fully characterized. A holistic analysis that integrates these diverse datasets is therefore needed to unravel the relationship between genetic variants within platelet proteins and their function in blood clotting.

The Genotyping and Phenotyping of Platelets (GAPP) study has recruited participants with abnormal bleeding and platelet dysfunction of unknown cause, using platelet phenotyping in combination with genome-wide and targeted gene sequencing.

Hypothesis

We propose to use an extensive bioinformatics screening approach, together with machine learning (ML) techniques, to associate candidate gene variants with detailed clinical and laboratory phenotyping and so better understand the biological mechanisms underpinning platelet function.

Aims

This project aims to identify rare genetic variants in platelet GPCR genes and to assess the functional impact of these rare variants on correct folding, ligand binding, and drug-induced signalling.

Experimental Methods and Research Plan

The project aims will be investigated through bioinformatic analysis of existing whole exome/genome DNA sequence data from the GAPP study and other public datasets. Discoveries made in this cohort will be verified in the UK Biobank. Machine learning methods will be used to integrate genetic variants with transcriptomic data and the associated clinical data, including platelet function testing and haematological parameters, to identify gene variants correlated with platelet function. Graph-based methods will be used to explore the relationships between the datasets. The project will also seek to predict structural changes in proteins caused by gene mutations that alter the amino acid sequence, and to relate these structural changes to gene function. Disrupted pathways will also be investigated.
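As an illustration only, the idea of relating genetic variants to platelet function testing can be sketched as a correlation between a per-participant rare-variant burden and a platelet function measure. This is a minimal sketch, not the project's method: the participant IDs, carrier assignments and aggregation values below are invented toy data, and the choice of a simple Pearson correlation is an assumption made for brevity.

```python
from math import sqrt

# Invented toy data (assumption): participant -> GPCR genes carrying a rare variant.
carriers = {
    "P1": {"P2RY12"},
    "P2": set(),
    "P3": {"P2RY12", "TBXA2R"},
    "P4": set(),
    "P5": {"F2R"},
}

# Invented toy data (assumption): platelet aggregation response, % of maximum.
aggregation = {"P1": 40.0, "P2": 80.0, "P3": 20.0, "P4": 85.0, "P5": 50.0}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


participants = sorted(carriers)
# Burden = number of genes of interest with at least one rare variant.
burden = [len(carriers[p]) for p in participants]
response = [aggregation[p] for p in participants]

r = pearson(burden, response)
print(f"Correlation between variant burden and aggregation: {r:.2f}")
```

In a real analysis the burden would be derived from called and annotated variants, and the association would need to account for covariates and multiple testing; the sketch only shows the shape of the integration step.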

References

  1. Kuter, et al. Overview of platelet disorders - hematology and oncology [Internet]. MSD Manual Professional Edition. MSD Manuals; 2022 [cited 2022 Dec 19]. Available from: https://www.msdmanuals.com/professional/hematology-and-oncology/thrombocytopenia-and-platelet-dysfunction/overview-of-platelet-disorders
  2. Michelson AD, et al. eds. Platelets. 4th ed. London, UK: Academic Press, 2019:701-6, 849-62, 877-904.

Techniques

  • R statistical programming language for statistical analyses.
  • Python programming language for machine learning methods.
  • Analysis of gene sequencing data and variant prioritisation.
  • Analysis of transcriptomics data using bash scripts and R.
  • Various statistical and machine learning methods will be used to integrate diverse multimodal datasets to identify biomarkers for diagnosis and therapeutic intervention.
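To give a flavour of the variant prioritisation technique listed above, the following minimal Python sketch filters annotated variants down to rare, potentially damaging ones in a set of genes of interest. Everything in it is an assumption for illustration: the `Variant` record, the `prioritise` helper, the 1% allele-frequency cut-off, the list of consequence labels, and the example records are hypothetical and are not taken from the GAPP study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical annotated variant record (illustrative fields only)."""
    gene: str
    consequence: str
    allele_freq: float  # population allele frequency, between 0 and 1


# Consequence labels treated as potentially damaging (assumed list).
DAMAGING = {"missense", "frameshift", "stop_gained", "splice_site"}


def prioritise(variants, genes_of_interest, max_af=0.01):
    """Keep rare, potentially damaging variants in the chosen genes, rarest first."""
    kept = [
        v for v in variants
        if v.gene in genes_of_interest
        and v.allele_freq <= max_af
        and v.consequence in DAMAGING
    ]
    return sorted(kept, key=lambda v: v.allele_freq)


# Invented example records (assumption), not real study data.
variants = [
    Variant("P2RY12", "missense", 0.0002),
    Variant("P2RY12", "synonymous", 0.0001),  # dropped: not a damaging class
    Variant("TBXA2R", "stop_gained", 0.00005),
    Variant("F2R", "missense", 0.15),         # dropped: too common
    Variant("ALB", "frameshift", 0.0001),     # dropped: not a gene of interest
]

shortlist = prioritise(variants, {"P2RY12", "TBXA2R", "F2R"})
for v in shortlist:
    print(v.gene, v.consequence, v.allele_freq)
```

Real pipelines typically work from annotated VCF files and combine several lines of evidence; the thresholds here would need to be chosen and justified for the cohort in question.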