# Christopher Pettitt

Master of Mathematics, Warwick 2010

Master of Mathematics and Statistics, Warwick 2011

MASDOC PhD Student, Warwick 2011-present

Email : c.a.mylastname ~t warwick.ac.uk

Office: B3.04

Research interests: Statistical inference on epidemic processes incorporating population genetic models

### At The Moment

I am currently in my third PhD year with the MASDOC doctoral training centre, researching inference on epidemics with multiple distinct pathogen strains.

### Research Summary

Epidemiology: General SIR models, in which every individual in a population is classified as (S)usceptible to infection, (I)nfective or (R)ecovered (neither S nor I), are the basis for many epidemiological studies. Common quantities of interest are the basic reproduction number R0, which determines whether the epidemic is expected to die out (R0 less than one) or can develop into a major outbreak (R0 greater than one), and the final size of the epidemic (the total number of individuals infected). Knowledge of these and related parameters allows control measures (such as vaccination, quarantine and culling) to be considered in ongoing or future epidemics.
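As an illustration, here is a minimal sketch of a stochastic general SIR epidemic simulated event by event. It assumes homogeneous mixing, an infection rate of `beta * S * I / N` and a removal rate of `gamma * I`, so that R0 = `beta / gamma`. The function name `simulate_sir` and all parameter values are hypothetical choices for this example.

```python
import random


def simulate_sir(n, beta, gamma, initial_infected=1, seed=None):
    """Simulate one general stochastic SIR epidemic in a population of size n.

    Returns (final_size, duration), where final_size counts everyone who was
    ever infected, including the initial infectives.
    """
    rng = random.Random(seed)
    s = n - initial_infected  # current number of susceptibles
    i = initial_infected      # current number of infectives
    t = 0.0
    while i > 0:
        infection_rate = beta * s * i / n
        removal_rate = gamma * i
        total_rate = infection_rate + removal_rate
        # Time to the next event is exponential with the total event rate.
        t += rng.expovariate(total_rate)
        # Choose the event type in proportion to its rate.
        if rng.random() < infection_rate / total_rate:
            s -= 1
            i += 1
        else:
            i -= 1
    return n - s, t


if __name__ == "__main__":
    for r0 in (0.5, 3.0):
        sizes = [simulate_sir(200, beta=r0, gamma=1.0, seed=k)[0] for k in range(200)]
        print(f"R0 = {r0}: mean final size = {sum(sizes) / len(sizes):.1f}")
```

With R0 below one the simulated outbreaks stay small, while with R0 above one a substantial fraction of runs infect most of the population.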

Inference: Inference can be difficult because most of the epidemic is unobserved: for instance, infection occurs before symptoms become apparent, and many individuals may not report having the disease or recovering from it. Inference can be treated as a missing data problem, in which a Bayesian MCMC algorithm samples each unknown parameter from its conditional distribution given the data and the current values of the other parameters and the unobserved data. By sampling each parameter in turn, a Markov chain is constructed whose stationary distribution is the joint posterior distribution of all parameters and missing data.
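The missing data idea can be sketched on a deliberately simple toy problem, which is not the epidemic model studied here. Assume infectious periods are exponential with unknown removal rate `gamma`, some periods are only known to exceed a censoring time, and `gamma` has a Gamma(`a`, `b`) prior. The sampler alternates between imputing the unobserved periods and drawing `gamma` from its conditional distribution. The function name and the data in the usage note are hypothetical.

```python
import random


def gibbs_removal_rate(observed, censored, n_iter=5000, a=1.0, b=1.0, seed=0):
    """Data-augmentation Gibbs sampler for an exponential removal rate.

    observed: fully observed infectious periods.
    censored: lower bounds for periods whose end was not observed.
    Returns the list of sampled gamma values, one per iteration.
    """
    rng = random.Random(seed)
    gamma = 1.0  # arbitrary starting value
    samples = []
    for _ in range(n_iter):
        # Step 1: impute each unobserved period given the current gamma.
        # By memorylessness, a period known to exceed c is c + Exp(gamma).
        imputed = [c + rng.expovariate(gamma) for c in censored]
        # Step 2: draw gamma from its conditional given the completed data,
        # which is Gamma(shape, rate) by conjugacy.
        shape = a + len(observed) + len(imputed)
        rate = b + sum(observed) + sum(imputed)
        gamma = rng.gammavariate(shape, 1.0 / rate)  # second argument is a scale
        samples.append(gamma)
    return samples
```

For example, `gibbs_removal_rate([2.0, 1.5, 3.0, 0.5, 2.5, 1.0], [4.0, 3.5])` returns draws whose long-run average approximates the posterior mean of `gamma`. In this toy case the posterior is available in closed form, so the sampler is only a check on the mechanics; in realistic epidemic models the imputation step usually needs a Metropolis-Hastings update instead.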

Population Genetics: In large populations, the genetic evolution of a sample of individuals can be well approximated by certain coalescent models, which consider the lineages backwards in time and join them whenever a reproduction event occurred. Coalescent models can be used to model epidemics when genetic samples of the pathogen population are available, since infections then correspond to movements of pathogen subpopulations between hosts. This model informs the inference procedure and can be used in conjunction with the standard process model to improve likelihood-based parameter estimates.
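The backwards-in-time construction can be sketched for the simplest case, the Kingman coalescent without mutation or host structure. The sketch assumes time is measured in coalescent units, so that each pair of lineages merges at rate one and `k` lineages wait an exponential time with rate `k(k-1)/2` before the next merger. The function name `simulate_kingman` is a hypothetical choice.

```python
import random


def simulate_kingman(n, seed=None):
    """Simulate the ancestry of n sampled individuals under the Kingman coalescent.

    Returns a list of (time, merged_labels) pairs, one per coalescence event,
    ordered from the present backwards to the most recent common ancestor.
    """
    rng = random.Random(seed)
    # Each lineage records the sampled individuals descended from it.
    lineages = [[i] for i in range(n)]
    t = 0.0
    events = []
    while len(lineages) > 1:
        k = len(lineages)
        # Waiting time until any of the k(k-1)/2 pairs coalesces.
        t += rng.expovariate(k * (k - 1) / 2)
        # The coalescing pair is chosen uniformly at random.
        i, j = rng.sample(range(k), 2)
        merged = lineages[i] + lineages[j]
        lineages = [lin for idx, lin in enumerate(lineages) if idx not in (i, j)]
        lineages.append(merged)
        events.append((t, sorted(merged)))
    return events
```

The time of the last event is the time to the most recent common ancestor, whose expectation under this model is 2(1 - 1/n).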

### Activity

25 July 2013: Talk, 'Inference of epidemics with multiple-strain data', at Auckland University, NZ.

June - July 2013: Collaboration with Dr. Chris Jewell at Massey University, Palmerston North, NZ.

5 December 2012: Talk, 'Inference on epidemic processes incorporating pathogen strain information', at the WIDER seminar.

May 2012: MASDOC summer retreat in Derbyshire.

26-28 March 2012: STOR-i problem solving days in Lancaster.

20 February 2012: Talk, 'Generating pseudorandom numbers', at the MASDOC seminar.

January - September 2012: APTS statistical training course, one week in each of Cambridge, Nottingham, Warwick and Glasgow.

18 October 2011: Talk, 'Importance sampling on genetic ancestries modelled with the Kingman coalescent', at the MASDOC seminar.

August 2011: I submitted my MSc dissertation on computational methods in population genetics, supervised by Dr Dario Spanò.

March 2011: InFER2011 (Inference for Epidemic-Related Risk) conference at Warwick University.

### MSc Dissertation

I review importance sampling methods for the inference of mutation rates from present-time population genetic data. The genetic ancestry of the present-time population is modelled using the Kingman coalescent, which is the limiting case of many categories of exchangeable Cannings models, including the discrete-time Wright-Fisher model and the continuous-time Moran model. The Kingman coalescent models the ancestry of the present-time sample backwards in (continuous) time via coalescence events (combining lineages) and mutations (changing the genetic type of a lineage).

The Kingman coalescent model provides a probability distribution function over all genetic ancestries for which a direct sampling method is not known. A likelihood function for the mutation rate is expressed as a missing data integral over the set of ancestries corresponding to the present-time sample, and importance sampling is used to evaluate this integral via proposal distributions developed by Stephens and Donnelly and the subsequent work of Griffiths and De Iorio.
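In generic notation (an assumption for this sketch, not the dissertation's own), write $D$ for the present-time sample, $H$ for an ancestry, $\theta$ for the mutation rate and $q$ for a proposal distribution whose support covers the ancestries compatible with $D$. The missing data integral and its importance sampling estimate then take the form:

```latex
L(\theta) = \int p_\theta(D, H)\,\mathrm{d}H
          = \int \frac{p_\theta(D, H)}{q(H)}\, q(H)\,\mathrm{d}H
          \approx \frac{1}{M} \sum_{i=1}^{M} \frac{p_\theta(D, H_i)}{q(H_i)},
\qquad H_1, \dots, H_M \overset{\text{iid}}{\sim} q .
```

The estimate is unbiased for any such $q$, but its variance depends heavily on how closely $q$ resembles the conditional distribution of $H$ given $D$, which is why the choice of proposal distribution matters.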

Following the review, I implement the methods on a Y-chromosome data set on which MCMC Gibbs sampling techniques have previously been used.

### MA916 - Research Study Groups

I worked with Don Praveen Amarasinghe, Andrew Aylwin and Pravin Madhavan on the Biomembranes project.

Supervisors: Prof. Charles Elliott and Dr. Björn Stinner.

### AOB

I sing tenor in the University of Warwick Chamber Choir and have been involved in most of Opera Warwick's productions, most recently The Marriage of Figaro and Alice in Wonderland, and currently Don Giovanni. I also play the piano, race triathlons and try to find time to improve my languages - mainly German and Japanese, although currently Italian plus a little Mandarin.