Events
CRiSM Seminar - Yee Whye Teh
Yee Whye Teh (Gatsby Computational Neuroscience Unit, UCL)
A Bayesian nonparametric model for genetic variations based on fragmentation-coagulation processes
Hudson's coalescent with recombination, also known as the ancestral
recombination graph (ARG), is a well-accepted model of genetic variation in
populations. With growing amounts of population genetics data, demand
for probabilistic models to analyse such data is strong, and the ARG
is a very natural candidate. Unfortunately, posterior inference in the
ARG is intractable, and a number of approximations and alternatives
have been proposed. A popular class of alternatives is based on
hidden Markov models (HMMs), which can be understood as approximating
the tree-structured genealogies at each point of the chromosome with a
partition of the observed haplotypes. However, due to the way HMMs
parametrize partitions using latent states, they suffer from
significant label-switching issues affecting the quality of posterior
inferences.
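The label-switching issue can be illustrated with a toy sketch. This is not the model from the talk: the five haplotypes, the three latent states, and the helper `partition_of` are all hypothetical and chosen only for illustration. The sketch shows that renaming the latent states produces several distinct state assignments that all induce the same partition of the haplotypes.

```python
from itertools import permutations


def partition_of(labels):
    """Group haplotype indices by latent state, discarding the state names."""
    blocks = {}
    for index, state in enumerate(labels):
        blocks.setdefault(state, set()).add(index)
    return frozenset(frozenset(block) for block in blocks.values())


# Hypothetical latent states of five haplotypes at a single site.
labels = (0, 0, 1, 2, 1)

# Every renaming of the three states is a distinct latent configuration...
relabelings = {
    tuple(perm[state] for state in labels)
    for perm in permutations(range(3))
}

# ...but all of them describe the same partition of the haplotypes.
partitions = {partition_of(relabeled) for relabeled in relabelings}

print(len(relabelings))  # 6 distinct label assignments
print(len(partitions))   # 1 underlying partition
```

A sampler that works with state labels has to move among all of these equivalent configurations, whereas a model defined directly on partitions treats them as a single object.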
We propose a novel Bayesian nonparametric model for genetic variations
based on Markov processes over partitions called
fragmentation-coagulation processes. In addition to some interesting
properties, our model does not suffer from the label-switching issues
of HMMs. We derive an efficient Gibbs sampler for the model and report
results on genotype imputation.
Joint work with Charles Blundell and Lloyd Elliott.