
Machine Learning Stratification of Patient Response to Chemotherapy in Ovarian Cancer

2013 - ongoing

My PhD focuses on the prediction of chemoresistance in ovarian cancer.

Chemoresistance may arise through two key mechanisms: mutations or altered gene expression. I am focusing on gene expression and investigating whether it can be used to predict patient response to treatment.

The PhD will involve both laboratory and computational work.

Laboratory work

A data set consisting of gene expression measurements taken from ovarian cancer patients is being generated.

Following mRNA extraction from FFPE biopsy tissue and reverse transcription, gene expression measurements are made using a custom TaqMan microfluidic array card, which measures 97 genes per patient.

Clinical covariates will be accessed through University Hospital of Coventry and Warwickshire (UHCW).

This lab work is being carried out in the Molecular Pathology labs at UHCW.

Computational work

Machine learning techniques are models that learn patterns from data. Here, I will be using Gaussian processes to model the relationships between genes and to predict patient outcome.

Gaussian processes place a prior over the space of functions relating features to outcome, encoding assumptions about properties such as smoothness and stationarity. Given the data, they provide a posterior distribution over the possible functions, allowing inferences to be made about the latent function underlying the data. In this way, both a predicted mean and its variance may be obtained at any point.
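As a rough illustration of how a posterior mean and variance are obtained, the sketch below implements standard Gaussian process regression with a squared-exponential kernel. It is written in Python with NumPy purely for illustration and is not the project's R implementation; the function names, the kernel choice, the hyperparameter values and the toy data are all assumptions made for this example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean and variance of the latent function at x_test."""
    # Covariance of the noisy training observations.
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)   # train/test cross-covariance
    K_ss = rbf_kernel(x_test, x_test)   # prior covariance at test points

    # Cholesky factorisation avoids forming an explicit inverse of K.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))

    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, var

# Toy data (hypothetical): noiseless samples of a sine curve.
x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_train = np.sin(x_train)

# One test point inside the training range, one far outside it.
x_test = np.array([0.0, 5.0])
mean, var = gp_posterior(x_train, y_train, x_test)
print("posterior mean:", mean)
print("posterior variance:", var)
```

The variance is small near the training inputs and returns towards the prior variance far from them, which is what lets the model report how uncertain each prediction is.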

R was chosen as a suitable language for this work, and so the first step was to implement Gaussian process regression in R.

For the Gaussian process models to handle the survival data correctly, additional functionality must be developed, because the data contain right-censored survival times. Here, the right-censored times are treated as missing, with each censoring time forming a lower bound on the true, unmeasured survival time. By inferring new survival times for the censored samples, the underlying uncensored data set may be estimated, allowing predictions to be made as in Gaussian process regression. The development of this model is currently underway.
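One way to picture the imputation step is sketched below. It assumes, purely for illustration, that the model supplies a Gaussian predictive mean and standard deviation for each patient, and that a censored survival time is replaced by a draw from that Gaussian truncated below at the censoring time. The actual inference scheme is still being developed and is not described here, so the function names, the truncated-normal choice and the example values are all hypothetical. As above, the sketch is in Python rather than R.

```python
import random
from statistics import NormalDist

def sample_truncated_normal(mean, sd, lower, rng):
    """Draw from N(mean, sd^2) restricted to values >= lower (inverse-CDF method)."""
    dist = NormalDist(mean, sd)
    eps = 1e-12
    # Probability mass below the bound, kept strictly inside (0, 1)
    # so that inv_cdf is always defined.
    p_lower = min(max(dist.cdf(lower), eps), 1.0 - eps)
    u = p_lower + (1.0 - eps - p_lower) * rng.random()
    # Guard against rounding pushing the draw just below the bound.
    return max(dist.inv_cdf(u), lower)

def impute_censored(times, censored, pred_mean, pred_sd, seed=0):
    """Return a completed data set: observed times are kept unchanged,
    censored times are replaced by draws above the censoring time."""
    rng = random.Random(seed)
    completed = []
    for t, is_censored, m, s in zip(times, censored, pred_mean, pred_sd):
        if is_censored:
            completed.append(sample_truncated_normal(m, s, t, rng))
        else:
            completed.append(t)
    return completed

# Hypothetical example values (e.g. log survival times).
times = [2.1, 3.4, 1.7, 2.9]
censored = [False, True, False, True]
pred_mean = [2.0, 3.0, 1.8, 3.2]   # assumed model predictions
pred_sd = [0.5, 0.5, 0.5, 0.5]

completed = impute_censored(times, censored, pred_mean, pred_sd)
print(completed)
```

Every imputed value respects the lower bound set by its censoring time, so the completed data set can be passed to ordinary Gaussian process regression. In a full model this step would typically be repeated, alternating with model fitting, rather than performed once.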

Systematic Review

A systematic review is being carried out to investigate the literature on predicting ovarian cancer patient response to chemotherapy using gene expression measurements.