See the PhD Opportunities section to check whether this project is currently open for applications via MIBTP.
Please note: The main page lists projects by the BBSRC Research Theme(s) quoted and then by the relevant Topic(s).
Leveraging machine learning and molecular dynamics simulations to facilitate engineering biology
Secondary Supervisor(s): Professor Long Tran-Thanh, Professor Greg Challis
University of Registration: University of Warwick
BBSRC Research Themes:
Project Outline
Bacterial natural products have wide-ranging applications in medicine and agriculture. These metabolites are assembled by complex modular enzymatic machinery, including polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs). The modular nature of these enzymatic machines makes them amenable to rational genetic manipulation, allowing the production of natural products and analogues that are challenging to obtain by synthetic chemistry. Structural modification of natural products via bioengineering enables optimised derivatives to be produced by fermentation, which is more sustainable than chemical synthesis.
Integrated biology approaches have contributed significantly to our understanding of factors such as protein-protein interactions, conformational changes and enzyme substrate specificity, all of which are important for engineering PKSs and NRPSs. Artificial intelligence (AI) and machine learning (ML) have recently revolutionised aspects of time-consuming structural biology workflows. For example, by providing reliable structure predictions, AlphaFold has helped to accelerate various steps in these workflows.
AI is widely applied to natural product discovery, but very little attention has so far been given to leveraging ML to accelerate the rational engineering of PKSs and NRPSs. In this PhD project, we intend to use a combination of molecular dynamics simulations and ML to speed up various aspects of this process. For example, while rule-based tools such as antiSMASH are widely used for identifying biosynthetic gene clusters, they are suboptimal at predicting domain boundaries, which is an important factor in engineering workflows. We will develop structure-based ML approaches to streamline such processes. Similarly, as has been done for adenylation domains, ML could aid in understanding substrate specificity for a much wider range of enzymes. From an AI perspective, we will use a novel approach in which multiple large language model agents tackle complex problems collaboratively.