# Machine Learning Reading Group

This webpage is old & out of date. The MLRG is now run by Iliana Peneva and can be found here.

We are an informal group, meeting fortnightly to discuss all things Machine Learning. Feel free to come along! We meet at 4pm Monday in D1.07. Free biscuits are provided :)

Our format is a single presenter per meeting who chooses the topic. No preparatory reading is required.

The Deep Learning Reading Group is a sister group dedicated to deep learning, running on alternate weeks to the ML group.

Give a talk! We would love a talk from anybody; email me if you're thinking about it! Talk about your own research, some core methodology, an interesting paper, or whatever else you like. If you're not sure if a topic is suitable just ask :)

## Next Meeting

- None :(

## Upcoming Meetings

- None :(

## Past Meetings

- 10/7/2017, Jim Skinner

**Structured PCA**

When p>>n, performing PCA to recover the Principal Components (PCs) of the data covariance can be statistically challenging. PCA traditionally returns a point estimate of the PCs, which can be misleading when uncertainty is high. If p>>n and the data are very noisy, the recovered PCs can be orthogonal to the true PCs.

I will present a hierarchical model that may be used to estimate PCs, but allows a prior to be specified (with optional hyperparameters, which can be tuned). Specifying prior knowledge can improve the accuracy of the estimated PCs. In addition, an approximate posterior over PCs is returned, allowing uncertainty to be quantified.

R package: https://github.com/JimSkinner/stpca

- 26/6/2017, Sebastien Raguideau

**Non-negative Matrix Factorisation**

Non-negative Matrix Factorisation (NMF) is a model used for both modelling and clustering purposes. In NMF, we try to reconstruct a set of samples or observations as generated from a non-negative mixture of unknown sources. The point of this method is thus to infer both the composition of the sources and their proportions in each sample.

In this talk I will present how such a problem can be set up and solved. I will also address a few questions such as identifiability, comparison with SVD and Bayesian approaches. Finally I will talk about my current research interest, namely path factorisation using the framework of NMF.
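As a rough illustration of the set-up described in this abstract, the sketch below factorises a small non-negative matrix V (samples by features) into proportions W and source compositions H using the classic multiplicative updates for squared error. This is only one of several ways to fit NMF and is not necessarily the approach presented in the talk; the toy data, the helper names and the iteration count are assumptions made for the example.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, n_iter, rng, eps=1e-9):
    """Approximate V (n x m) by W (n x rank) times H (rank x m), all entries >= 0."""
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), elementwise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

# Toy data (hypothetical): six samples mixed from two non-negative sources.
data_rng = random.Random(0)
sources = [[1.0, 0.0, 2.0, 0.0], [0.0, 3.0, 0.0, 1.0]]
proportions = [[data_rng.random(), data_rng.random()] for _ in range(6)]
v = matmul(proportions, sources)

w0, h0 = nmf(v, rank=2, n_iter=0, rng=random.Random(1))   # random start only
w, h = nmf(v, rank=2, n_iter=200, rng=random.Random(1))   # same start, then updates
print(frob_error(v, w0, h0), frob_error(v, w, h))
```

The recovered W and H are only determined up to scaling and permutation of the sources, which is one face of the identifiability question mentioned in the abstract.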

- 12/6/2017, Matt Moores

**Approximate Bayesian computation for the Ising/Potts model**

Bayes’ formula involves the likelihood function, p(y|theta), which is a problem when the likelihood is unavailable in closed form. ABC is a method for approximating the posterior p(theta|y) without evaluating the likelihood. Instead, pseudo-data is simulated from a generative model and compared with the observations. This talk will give an introduction to ABC algorithms: rejection sampling, ABC-MCMC and ABC-SMC. Application of these algorithms to image analysis will be presented as an illustrative example. These methods have been implemented in the R package bayesImageS.

This is joint work with Christian Robert (Warwick/Dauphine), Kerrie Mengersen and Christopher Drovandi (QUT).

- Grelaud, A.; Robert, C. P.; Marin, J.-M.; Rodolphe, F. & Taly, J.-F. (2009) “ABC likelihood-free methods for model choice in Gibbs random fields” Bayesian Analysis 4(2): 317-336. http://dx.doi.org/10.1214/09-BA412

- Moores, M. T.; Drovandi, C. C.; Mengersen, K. & Robert, C. P. (2015) “Pre-processing for approximate Bayesian computation in image analysis” Statistics & Computing 25(1): 23-33. http://dx.doi.org/10.1007/s11222-014-9525-6
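The simplest of the ABC algorithms named in this abstract, rejection sampling, can be sketched in a few lines. The example is a toy problem chosen for illustration (inferring the mean of a unit-variance Normal), not the Ising/Potts application from the talk, and it does not use bayesImageS; the function name, the uniform prior, the summary statistic and the tolerance are all assumptions.

```python
import random
import statistics

def abc_rejection(observed, prior_sample, simulate, distance, tolerance, n_draws, rng):
    """Keep prior draws whose simulated pseudo-data lies within `tolerance` of the data.

    The likelihood is never evaluated: only simulation from the generative model
    and a comparison with the observations are needed.
    """
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                      # draw a parameter from the prior
        pseudo = simulate(theta, len(observed), rng)   # simulate pseudo-data given theta
        if distance(pseudo, observed) <= tolerance:    # compare with the observations
            accepted.append(theta)
    return accepted

# Toy set-up (hypothetical): data from Normal(2, 1), the mean treated as unknown.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]

posterior_draws = abc_rejection(
    observed,
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    simulate=lambda theta, n, r: [r.gauss(theta, 1.0) for _ in range(n)],
    # Distance between sample means, used here as the summary statistic.
    distance=lambda a, b: abs(statistics.fmean(a) - statistics.fmean(b)),
    tolerance=0.1,
    n_draws=5000,
    rng=rng,
)
print(len(posterior_draws), statistics.fmean(posterior_draws))
```

Shrinking the tolerance makes the accepted draws a closer approximation to the posterior but lowers the acceptance rate, which is the trade-off that ABC-MCMC and ABC-SMC aim to manage more efficiently.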

- 15/5/2017, Leonidas Souliotis

**Stochastic Block Models for Network Classification**

Stochastic blockmodels (SBMs) are generative models for blocks, groups or communities in networks. Widely used in, but not restricted to, the social sciences, these models help scientists model phenomena in which the objects of interest are represented as vertices in directed or undirected graphs. In this talk, we will consider the simplest models, where each of the N vertices is assigned to one of K communities, groups or blocks. I will talk about both the standard SBM and the degree-corrected version, in which we add an extra parameter controlling the expected degree of each vertex.
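To make the generative view concrete, here is a minimal sketch of sampling an undirected graph from the standard SBM: each vertex has a community label, and each pair of vertices is joined independently with a probability that depends only on the two labels. The block sizes and edge probabilities are made-up values, and the degree-specific parameter mentioned in the abstract is not included.

```python
import random

def sample_sbm(block_sizes, edge_prob, rng):
    """Sample an undirected graph from a standard stochastic blockmodel.

    block_sizes[k] is the number of vertices in community k;
    edge_prob[k][l] is the probability of an edge between a vertex
    in community k and a vertex in community l.
    Returns (labels, edges) with edges as pairs (i, j), i < j.
    """
    labels = [k for k, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < edge_prob[labels[i]][labels[j]]:
                edges.append((i, j))
    return labels, edges

# Hypothetical example: two communities of 30 vertices, dense within, sparse between.
rng = random.Random(1)
labels, edges = sample_sbm([30, 30], [[0.5, 0.05], [0.05, 0.5]], rng)

within = sum(labels[i] == labels[j] for i, j in edges)
between = len(edges) - within
print(within, between)
```

Inference runs in the opposite direction: given only the edges, the task is to recover the labels and the matrix of edge probabilities.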

- 3/4/2017, Iliana Peneva

**Nonlocal priors**

Choosing the number of components remains a challenge in mixture modelling. Traditional model selection criteria such as BIC and Bayes factors often lead to poorly separated components and to a lack of parsimony. Non-local priors are a family of distributions which enforce separation between the models under consideration. In this talk, I will introduce non-local priors and some of the default priors. I will show how they lead to extra parsimony and present their application to multivariate Normal mixtures.

- 6/3/2017, Rob Eyre

**Building a Belief Network**

In social science, causal modelling is an incredibly valuable but very difficult (some would say impossible) thing to do. One tool that has long been presented as better suited to causal modelling than most is the belief, or Bayesian, network. These networks present directed links amongst a set of variables, accompanied by the conditional probabilities of the states of those variables given the others. We will go through all the steps of constructing a belief network for a system: from choosing the variables, to learning the structure and probabilities, and finally using the structure of the network and algorithms such as belief propagation to find out how the different variables impact upon each other. In doing so we will not only see various probabilistic algorithms, but also learn how to overcome cognitive biases and elicit knowledge from system experts.

- 20/2/2017, Ayman Boustati

**GP Models for Multitask Learning**

Multi-task learning refers to a framework in machine learning where one wants to learn more than one task, with the tasks sharing a common domain. In this talk, I will explain how to incorporate this framework within Bayesian modelling using Gaussian processes. I will explain the general structure of a multi-output GP, and explore some of the most common kernel structures that correspond to such models.
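One widely used kernel structure for multi-output GPs is the intrinsic coregionalisation model, in which the covariance between task i at input x and task j at input x' is the product of a task covariance B[i][j] and an ordinary input kernel k(x, x'). The abstract does not say which structures the talk covered, so the sketch below is an assumption, as are the RBF input kernel and the numbers used.

```python
import math

def rbf(x, y, lengthscale=1.0):
    """Squared-exponential kernel on scalar inputs."""
    return math.exp(-0.5 * ((x - y) / lengthscale) ** 2)

def icm_covariance(inputs, task_cov, lengthscale=1.0):
    """Joint covariance over all (task, input) pairs.

    Entry [(i, a), (j, b)] equals task_cov[i][j] * k(inputs[a], inputs[b]),
    i.e. the Kronecker product of the task covariance with the input kernel matrix.
    Rows and columns are ordered task by task.
    """
    n_tasks, n = len(task_cov), len(inputs)
    size = n_tasks * n
    cov = [[0.0] * size for _ in range(size)]
    for i in range(n_tasks):
        for a in range(n):
            for j in range(n_tasks):
                for b in range(n):
                    cov[i * n + a][j * n + b] = (
                        task_cov[i][j] * rbf(inputs[a], inputs[b], lengthscale)
                    )
    return cov

# Hypothetical example: two strongly correlated tasks observed at three inputs.
inputs = [0.0, 0.5, 1.0]
task_cov = [[1.0, 0.8], [0.8, 1.0]]
cov = icm_covariance(inputs, task_cov)
print(len(cov), cov[0][3])
```

The off-diagonal entries of the task covariance are what let observations of one task inform predictions for another; setting them to zero recovers independent GPs per task.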

- 6/2/2017, Matt Neal

**Variational Bayesian Inference: For when you look at Gibbs sampling and think “Well, that’s just too easy”**

Variational Bayesian methods allow one to derive analytic approximations to intractable integrals arising in Bayesian inference. This talk is a beginner's introduction to using variational Bayes to approximate the posterior probability of unobserved variables in a Bayesian model, as an alternative to Monte Carlo methods.

- 19/9/2016, Michael Pearce

**Gaussian Processes for General Optimisation**

Many optimisation algorithms make assumptions about the objective function, for example that function evaluations are noise-free or that the gradient can be evaluated. In cases where things are a bit messier, one can use Simulated Annealing, or alternatively use Gaussian Process Regression as a surrogate model that provides statistical estimates of the objective function. Using these estimates we can iteratively collect function evaluations to find the highest point as quickly as possible. I will give an introductory talk about Gaussian Process Regression and a few such "messy" optimisation methods, including one that I have been working on for my PhD.

- 3/10/2016, Jim Skinner

**PCA with prior knowledge**

Feature learning on noisy or high dimensional data can be tricky, and prior knowledge of the expected structure of the data may be essential to learn sensible features. I introduce an extension to Probabilistic PCA enabling prior specification and hyperparameter tuning. An R package is in development.

- 17/10/2016, Iliana Peneva

**Using Collapsed Gibbs Sampling in Mixture Models**

In this talk I will introduce the latent variable representation of Mixture Models, in particular Gaussian Mixture Models, and will talk about using Collapsed Gibbs Sampling to do inference in these models. I will also present an example from my current work to illustrate the process of inference.

- 31/10/2016, No meeting
- 14/11/2016, Matthew Neal

**A guided tour of Gaussian Process Latent Variable Model (GPLVM) extensions**

GPLVMs are a nonlinear extension of dual probabilistic PCA providing a generative latent variable model of high-dimensional data. After a brief introduction to GPLVMs we will present three extensions to GPLVM:

- Back-constrained GPLVM constrains the latent variables to be a function of the original data;

- Discriminative GPLVM regularizes GPLVM using Fisher's Linear Discriminant to identify features relevant to a classification task;

- Structured GPLVM is a novel technique for incorporating prior knowledge of underlying structure in the data.

- 28/11/2016, Alejandra Avalos

**Factor analysis and latent factor regression with batch effect adjustment**

Factor analysis is a dimensionality reduction technique which aims to describe the covariance of an observed set of variables. In this talk I will provide a general overview of the factor analysis model and its relationship with some of the most commonly used dimensionality reduction techniques such as PCA. An extension of this model will be discussed, combining factor analysis with a latent factor regression model with batch effect adjustment. Finally, a motivating case study based on cancer datasets is presented and discussed.

### Time and Location:

Mondays 4-5pm

Room D1.07

Zeeman Building

### Contact

Jim Skinner (organiser) j dot r dot skinner at warwick dot ac dot uk