Statistical Machine Learning Reading Group

SUSPENDED TILL FURTHER NOTICE

About: Founded by Murray in autumn 2016, the reading group covered quite a number of chapters of the old core book (see Resources) till summer 2018. In 2018/19, the plan is to first learn about Deep Gaussian Processes (DGPs), and then move on to Generative Adversarial Networks (GANs).

Organisers: Sigurd Assing, Sherman Ip

Website URL: www.warwick.ac.uk/smlrg

Forum: www.warwick.ac.uk/smlrg/ml-seminar

Mailing List Sign-Up: http://mailman1.csv.warwick.ac.uk/mailman/listinfo/machinelearning

Mailing List: machinelearning@listserv.csv.warwick.ac.uk (NB - only approved members can post)

ACADEMIC YEAR 2018/19

21st February. Henry Jia (Warwick). Tutorial on GANs: https://arxiv.org/pdf/1701.00160.pdf

Abstract. Henry starts with this tutorial, which was given by Ian Goodfellow at NIPS in 2016. The tutorial describes: (1) Why generative modeling is a topic worth studying, (2) how generative models work, and how GANs compare to other generative models, (3) the details of how GANs work, (4) research frontiers in GANs, and (5) state-of-the-art image models that combine GANs with other methods. Finally, the tutorial contains three exercises for readers to complete, and the solutions to these exercises.

14th February. Sigurd Assing (Warwick). Section 6.2 (Andreas Damianou's PhD thesis)

Abstract. Finishing with deep Gaussian processes, discussing variational inference across layers.

24th January. Sigurd Assing (Warwick). Section 6.2 (Andreas Damianou's PhD thesis)

Abstract. Discussing variational inference in deep Gaussian processes, first within a layer, second across layers.

17th January. Henry Jia (Warwick). Finishing Autoencoding Variational Bayes

Abstract. See the session on 6th December.

6th December. Henry Jia (Warwick). Presenting Autoencoding Variational Bayes

Abstract. In this paper, variational Bayes is applied to auto-encoding, but in a more general framework than Gaussian processes: the likelihood is not just Gaussian with a covariance depending on the distribution of some latent variables. Again the variational lower bound comes into play, but it is now reparametrised, giving a stochastic gradient variational Bayes (SGVB) estimator. This paper is also cited several times in Damianou's thesis on Deep Gaussian Processes, which we "attack" next. For example, it pops up in Section 4.4 before Damianou deals with Deep Gaussian Processes, so it's nice to see more of the autoencoder material before reading Damianou's thesis. Henry will also refer to the following tutorial on Variational Autoencoders, which is a nice background read anyway.
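
For readers new to the reparametrisation idea mentioned above, here is a minimal toy sketch. It is not the SGVB estimator from the paper, only the underlying trick: a sample z ~ N(mu, sigma^2) is written as z = mu + sigma * eps with eps ~ N(0, 1), so that Monte Carlo gradients with respect to mu and sigma can be taken through the sample. The function name reparam_gradients, the test function f(z) = z^2 and all parameter values are assumptions chosen for illustration.

```python
import random


def reparam_gradients(mu, sigma, df, n_samples=20000, seed=0):
    """Monte Carlo estimate of d/dmu and d/dsigma of E[f(z)], z ~ N(mu, sigma^2).

    Uses z = mu + sigma * eps with eps ~ N(0, 1), so the chain rule gives
    d f(z)/dmu = f'(z) and d f(z)/dsigma = f'(z) * eps.
    `df` is the derivative f' of the (hypothetical) integrand f.
    """
    rng = random.Random(seed)
    g_mu = 0.0
    g_sigma = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)   # noise that does not depend on mu or sigma
        z = mu + sigma * eps        # reparametrised sample
        slope = df(z)
        g_mu += slope               # dz/dmu = 1
        g_sigma += slope * eps      # dz/dsigma = eps
    return g_mu / n_samples, g_sigma / n_samples


def df(z):
    # Derivative of the toy integrand f(z) = z^2.
    return 2.0 * z


g_mu, g_sigma = reparam_gradients(mu=1.5, sigma=0.8, df=df)
print(g_mu, g_sigma)
```

For this toy integrand, E[z^2] = mu^2 + sigma^2, so the exact gradients are 2 * mu = 3.0 and 2 * sigma = 1.6, and the printed estimates should land close to these values.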

29th November. Joe Meagher (Warwick). Presenting Deep Gaussian Processes for Regression using Approximate Expectation Propagation

Abstract. After having spent some time on simple GP models, it is time to look into DGP. In the paper to be presented, a new approximate Bayesian learning scheme is developed. The new method will be used to evaluate non-linear regression on eleven real-world datasets, showing that it always outperforms GP regression and is almost always better than state-of-the-art deterministic and sampling-based approximate inference methods for Bayesian neural networks. As a by-product, this work provides a comprehensive analysis of six approximate Bayesian methods for training neural networks.

22nd November. Sigurd Assing (Warwick). Section 4+5 (Bayesian Gaussian Process Latent Variable Model)

Abstract. After we understood how the approximating density given by (20) on page 848 in the paper was built, it's time to look at the quality of such a density from the critical angle :-), finishing the discussion of this paper.

15th November. Sigurd Assing (Warwick). Section 4 (Bayesian Gaussian Process Latent Variable Model)

Abstract. Since people came who had missed the previous session, I went again through the derivation of the lower bound.

8th November. Sigurd Assing (Warwick). Section 4 (Bayesian Gaussian Process Latent Variable Model)

Abstract. The density used for prediction, given by (19) on page 848, is a ratio of two lower bounds. We will discuss with the group how good such a ratio could be, and repeat a few of the steps leading to these lower bounds.

1st November. Cancelled.

25th October. Sigurd Assing (Warwick). Sections 1-3 (Bayesian Gaussian Process Latent Variable Model)

Abstract. Discussing the first bits of variational inference.

18th October. Sigurd Assing (Warwick). Sections 1-3 (Bayesian Gaussian Process Latent Variable Model)

Abstract. Setting up the problem, and motivating decomposition (11) on page 846.

11th October. Sigurd Assing (Warwick). Introduction

Abstract. Use of Gaussian processes in Unsupervised Learning is motivated by recalling kernel methods in Supervised Learning.

Resources: