# ST329: Topics in Statistics

###### Lecturer(s): Dr Paul Chleboun, Professor Xavier Didelot, Dr Ioannis Kosmidis

Prerequisite(s): Either ST218/219 Mathematical Statistics A&B or ST220 Introduction to Mathematical Statistics

Content: Three self-contained sets of ten lectures. For a description of the topics see below. Please note that the topics covered in this module may change from year to year.

Commitment: 3 x 10 lectures in term 2, plus 1 revision class per topic in term 3.

Assessment: 100% by 2-hour examination.

###### Topic: Generalized Linear Models with large data sets

Lecturer: Dr Ioannis Kosmidis

Aims: To introduce core methods and software tools for tackling regression problems that involve large data sets, whether in terms of the number of observations or the number of explanatory variables.

Objectives: By the end of this topics module, students should be able to:

• Define linear and generalized linear models for a data set at hand, fit them, and interpret the output of key methods for regression modelling in R
• Identify least squares as a core optimization problem for fitting linear and generalized linear models, and classify the various methods for its solution in terms of complexity, memory usage and accuracy
• Describe a range of incremental bounded-memory algorithms and stochastic gradient descent algorithms for large regression problems, and contrast them in terms of their relative merits
• Use existing software and tools for estimating linear and generalized linear models from large data sets
• Implement new algorithms for handling large data sets with more complex regression models (e.g. generalized non-linear models or extensions with smooth terms) and alternative estimation techniques (e.g. ridge regression), using new R functionality (time permitting).

Recommended: Familiarity with principles of linear regression and R programming is desirable.

###### Topic: Hidden Markov Models

Lecturer: Dr Xavier Didelot

Aims: To introduce Hidden Markov Models as a powerful, popular and flexible statistical methodology for analysing sequential data.

Objectives: By the end of this topics module, students should be able to:

• Define Hidden Markov Models
• Describe and implement the algorithms for computing the likelihood, estimating parameters, decoding and forecasting
• Select and test a Hidden Markov Model for a given sequential dataset
• Integrate Hidden Markov Models within a Bayesian analysis
• Describe typical applications of Hidden Markov Models, for example to speech recognition or genetic data analysis

###### Topic: Combinatorial Stochastic Processes

Lecturer: Dr Paul Chleboun

Aims: Combinatorics is an integral part of probability theory. In this course we will discuss generating functions, Bell polynomials and their applications to counting partitions and objects with composite structures.

Objectives: At the end of the course students should be able to use generating functions for counting and for identifying distributions.

Recommended: Basic knowledge of Probability Theory (Expectation, independence).

You may also wish to see:

ST329: Resources for Current Students (restricted access)