# ST329: Topics in Statistics

###### Lecturer(s): Dr Paul Chleboun, Professor Xavier Didelot, Dr Ioannis Kosmidis

**Prerequisite(s):** Either ST218/219 Mathematical Statistics A&B or ST220 Introduction to Mathematical Statistics

**Content:** Three self-contained sets of ten lectures. For a description of the topics, see below. Please note that the topics covered in this module may change from year to year.

**Commitment:** 3 x 10 lectures in term 2, plus 1 revision class per topic in term 3.

**Assessment:** 100% by 2-hour examination.

###### Topic: Generalized Linear Models with large data sets

**Lecturer:** Dr Ioannis Kosmidis

**Aims:** To introduce core methods and software tools for tackling regression problems that involve large data sets either in terms of number of observations or in terms of explanatory variables.

**Objectives:** By the end of this topics module, students should be able to:

- Define linear and generalized linear models for a data set at hand, fit them, and interpret the output of key methods for regression modelling in R
- Identify least squares as a core optimization problem for fitting linear and generalized linear models, and classify the various methods for its solution in terms of complexity, memory usage and accuracy
- Describe a range of incremental bounded-memory algorithms and stochastic gradient descent algorithms for large regression problems, and contrast them in terms of their relative merits
- Use existing software and tools for estimating linear and generalized linear models from large data sets
- Implement new algorithms for handling large data sets with more complex regression models (e.g. generalized non-linear models or extensions with smooth terms) and alternative estimation techniques (e.g. ridge regression), using new R functionality (if time permits)

**Recommended:** Familiarity with principles of linear regression and R programming is desirable.
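
As a rough illustration of the bounded-memory idea named in the objectives, the sketch below fits a simple linear regression by accumulating sufficient statistics one chunk at a time, so the full data set never has to be held in memory. It is written in Python rather than R purely for illustration; the function name `fit_line_in_chunks` and the chunk layout are assumptions, not part of the module material. It solves the normal equations directly, which is generally less accurate than QR-based approaches on ill-conditioned problems.

```python
from typing import Iterable, List, Tuple

Chunk = List[Tuple[float, float]]  # hypothetical layout: a list of (x, y) pairs


def fit_line_in_chunks(chunks: Iterable[Chunk]) -> Tuple[float, float]:
    """Least-squares (intercept, slope) for y = a + b*x, one chunk at a time.

    Only five running sums are stored, so memory use does not grow
    with the number of observations.
    """
    n = sx = sy = sxx = sxy = 0.0
    for chunk in chunks:
        for x, y in chunk:
            n += 1.0
            sx += x
            sy += y
            sxx += x * x
            sxy += x * y

    # Solve the 2x2 normal equations in closed form.
    det = n * sxx - sx * sx
    if det == 0.0:
        raise ValueError("need at least two distinct x values")
    slope = (n * sxy - sx * sy) / det
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Toy data lying exactly on y = 1 + 2x, split into three chunks.
toy_chunks = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
    [(4.0, 9.0), (5.0, 11.0)],
]
intercept, slope = fit_line_in_chunks(toy_chunks)
```

The same accumulate-then-solve pattern extends to several explanatory variables by storing the cross-product matrix instead of scalar sums.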

###### Topic: Hidden Markov Models

**Lecturer:** Dr Xavier Didelot

**Aims:** To introduce Hidden Markov Models as a powerful, popular and flexible statistical methodology of analysis for sequential data.

**Objectives:** By the end of this topics module, students should be able to:

- Define Hidden Markov Models
- Describe and implement the algorithms for computing the likelihood, estimating parameters, decoding and forecasting
- Select and test a Hidden Markov Model for a given sequential dataset
- Integrate Hidden Markov Models within a Bayesian analysis
- Describe typical applications of Hidden Markov Models, for example to speech recognition or genetic data analysis
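
To give a flavour of the likelihood computation mentioned in the objectives, here is a minimal sketch of the forward algorithm for a discrete Hidden Markov Model. The function name `forward_likelihood` and the toy parameter values are illustrative assumptions, not taken from the lectures. The sketch omits the scaling (or log-space arithmetic) that is usually needed to avoid underflow on long sequences.

```python
from typing import Sequence


def forward_likelihood(
    init: Sequence[float],             # init[s] = P(first hidden state is s)
    trans: Sequence[Sequence[float]],  # trans[r][s] = P(next state s | current state r)
    emit: Sequence[Sequence[float]],   # emit[s][o] = P(observe symbol o | state s)
    obs: Sequence[int],                # observed symbols, coded as integers
) -> float:
    """Probability of the observed sequence under the model."""
    if len(obs) == 0:
        return 1.0  # the empty sequence has probability one
    n_states = len(init)

    # alpha[s] = P(observations so far, current hidden state is s)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state model with two possible observation symbols.
toy_init = [0.5, 0.5]
toy_trans = [[0.9, 0.1], [0.2, 0.8]]
toy_emit = [[0.8, 0.2], [0.3, 0.7]]
likelihood = forward_likelihood(toy_init, toy_trans, toy_emit, [0, 0, 1])
```

The cost is proportional to the sequence length times the square of the number of states, whereas summing over every hidden path directly grows exponentially with the sequence length.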

###### Topic: Combinatorial Stochastic Process

**Lecturer:** Dr Paul Chleboun

**Aims:** Combinatorics is an integral part of probability theory. In this course we will discuss generating functions and Bell polynomials, and their applications to counting partitions and objects with composite structures.

**Objectives:** At the end of the course students should be able to use generating functions for counting and for identifying distributions.

**Recommended:** Basic knowledge of Probability Theory (Expectation, independence).
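
As a small example of the kind of counting problem this topic addresses, the sketch below computes Bell numbers, which count the partitions of a set of n elements. It uses the standard recurrence B(n+1) = Σ_k C(n, k) B(k), which can be derived from the exponential generating function exp(e^x − 1). The function name `bell_numbers` is an illustrative choice, and the lectures may well take a different route.

```python
from math import comb
from typing import List


def bell_numbers(n_max: int) -> List[int]:
    """Return [B(0), B(1), ..., B(n_max)], the numbers of set partitions."""
    bell = [1]  # B(0) = 1: the empty set has exactly one partition
    for n in range(n_max):
        # Choose the k elements that do NOT share a block with element n+1,
        # then partition those k elements in B(k) ways.
        bell.append(sum(comb(n, k) * bell[k] for k in range(n + 1)))
    return bell


print(bell_numbers(5))  # [1, 1, 2, 5, 15, 52]
```

For instance, B(3) = 5 corresponds to the five partitions of {1, 2, 3}: one with a single block, three with two blocks, and one with three singleton blocks.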
