
Statistical Machine Learning Reading Group

During term time

Thursday 11 October (and every Thursday thereafter) 1300-1400 MSB2.22

About: Founded by Murray in autumn 2016, the reading group covered a number of chapters of the old core book (see Resources) until summer 2018. In 2018/19, the plan is to first learn about Deep Gaussian Processes (DGPs), and then move on to Generative Adversarial Networks (GANs). The first article to be read is Bayesian Gaussian Process Latent Variable Model. A good reference for what will happen later on w.r.t. DGPs is Andreas Damianou's PhD thesis (see Resources).

Organisers: Sigurd Assing, Sherman Ip, Murray Pollock.

Website URL:


Mailing List Sign-Up:

Mailing List: (NB - only approved members can post)

2018/19 Term 1:

11th October. Sigurd Assing (Warwick). Introduction

Abstract. Use of Gaussian processes in Unsupervised Learning is motivated by recalling kernel methods in Supervised Learning.

18th October. Sigurd Assing (Warwick). Sections 1-3 (Bayesian Gaussian Process Latent Variable Model)

Abstract. Setting up the problem, and motivating decomposition (11), on page 846.

25th October. Sigurd Assing (Warwick). Sections 1-3 (Bayesian Gaussian Process Latent Variable Model)

Abstract. Discussing first bits of variational inference.

1st November. Cancelled.

8th November. Sigurd Assing (Warwick). Section 4 (Bayesian Gaussian Process Latent Variable Model)

Abstract. The density used for prediction, given by (19) on page 848, is a ratio of two lower bounds. We will discuss with the group how good such a ratio could be, and will repeat a few of the steps leading to these lower bounds.
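
As a rough sketch of the question up for discussion (the notation below is an assumption for illustration and is not copied from the paper): write $\mathcal{F}_1$ for a variational lower bound on $\log p(y_*, Y)$ and $\mathcal{F}_2$ for one on $\log p(Y)$, with gaps $g_1, g_2 \ge 0$.

```latex
p(y_* \mid Y) \;=\; \frac{p(y_*, Y)}{p(Y)}
\;\approx\; \frac{\exp(\mathcal{F}_1)}{\exp(\mathcal{F}_2)},
\qquad
\mathcal{F}_1 = \log p(y_*, Y) - g_1, \quad
\mathcal{F}_2 = \log p(Y) - g_2 .

\frac{\exp(\mathcal{F}_1)}{\exp(\mathcal{F}_2)}
\;=\; p(y_* \mid Y)\, \exp(g_2 - g_1).
```

Since both numerator and denominator are under-estimated, the ratio is in general neither an upper nor a lower bound on the true predictive density; it is accurate only to the extent that the two gaps are similar in size.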

15th November. Sigurd Assing (Warwick). Section 4 (Bayesian Gaussian Process Latent Variable Model)

Abstract. Since people came who had missed the previous session, I went again through the derivation of the lower bound.

22nd November. Sigurd Assing (Warwick). Sections 4+5 (Bayesian Gaussian Process Latent Variable Model)

Abstract. Now that we have understood how the approximating density given by (20) on page 848 in the paper was built, it's time to look at the quality of such a density from a critical angle :-), finishing the discussion of this paper.

29th November. Joe Meagher (Warwick). Presenting Deep Gaussian Processes for Regression using Approximate Expectation Propagation

Abstract. After having spent some time on simple GP models, it is time to look into DGP. In the paper to be presented, a new approximate Bayesian learning scheme is developed. The new method will be used to evaluate non-linear regression on eleven real-world datasets, showing that it always outperforms GP regression and is almost always better than state-of-the-art deterministic and sampling-based approximate inference methods for Bayesian neural networks. As a by-product, this work provides a comprehensive analysis of six approximate Bayesian methods for training neural networks.

6th December. Henry Jia (Warwick). Presenting Auto-Encoding Variational Bayes

Abstract. In this paper variational Bayes is applied to auto-encoding, but in a more general framework than Gaussian processes; that is, the likelihood is not just Gaussian with a covariance depending on the distribution of some latent variables. Again the variational lower bound comes into play, but it is now reparametrised, giving a stochastic gradient variational Bayes (SGVB) estimator. This paper is also cited several times in Damianou's thesis on Deep Gaussian Processes, which we "attack" next. For example, it pops up in Section 4.4 before Damianou deals with Deep Gaussian Processes, so it's nice to see more of the autoencoder stuff before reading Damianou's thesis. Henry will also refer to the following tutorial on Variational Autoencoders, which is a nice background read anyway.
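
The reparametrisation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's full SGVB estimator (there is no encoder network and no KL term); it only shows, under the assumption of a one-dimensional Gaussian q(z) = N(mu, sigma^2) and a toy integrand f(z) = z^2, how writing z = mu + sigma * eps with eps ~ N(0, 1) turns gradients of an expectation into plain Monte Carlo averages. The function name `reparam_gradients` and all parameter values are hypothetical.

```python
import numpy as np


def reparam_gradients(f_prime, mu, sigma, n_samples=200_000, seed=0):
    """Monte Carlo gradients of E_{z ~ N(mu, sigma^2)}[f(z)] w.r.t. mu and sigma.

    Uses the reparametrisation z = mu + sigma * eps, eps ~ N(0, 1), so that
    d/dmu f(z) = f'(z) and d/dsigma f(z) = f'(z) * eps (chain rule).
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)  # noise that does not depend on mu, sigma
    z = mu + sigma * eps                  # reparametrised samples from q(z)
    df = f_prime(z)                       # derivative of the integrand at each sample
    grad_mu = np.mean(df)                 # estimate of d/dmu E[f(z)]
    grad_sigma = np.mean(df * eps)        # estimate of d/dsigma E[f(z)]
    return grad_mu, grad_sigma


# Toy check with f(z) = z^2, so E[f(z)] = mu^2 + sigma^2 and the exact
# gradients are 2 * mu and 2 * sigma.
grad_mu, grad_sigma = reparam_gradients(lambda z: 2.0 * z, mu=1.5, sigma=0.7)
print(grad_mu, grad_sigma)
```

For this toy integrand the estimates should land close to the exact values 2 * mu = 3.0 and 2 * sigma = 1.4, up to Monte Carlo noise.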