# MA930: Data Analysis and Machine Learning (15 CATS)

### Lecturer: Haoran Ni for 2023-2024

### This module is not available to students outside the MathSys CDT in 2023/24

**Module Aims**

This is one of four core taught modules for the MSc in Mathematics of Real-World Systems. The main aims are to provide the students with a broad knowledge of modern techniques of exploratory data analysis, time series modelling and forecasting, and a short introduction to machine learning. By the end of this module, the students will be able to quantitatively summarise and critically assess data from real-world systems, use modern methods of parameter estimation to model and forecast time-series data, and incorporate observations into mathematical models to reduce the uncertainty in predictions made using these models.

**Syllabus**

- Basic probability: distributions, characteristic functions
- Basic statistics: sample mean and variance, law of large numbers and central-limit theorem
- Frequentist statistics: point estimation, confidence intervals, type-I and II errors, hypothesis tests
- Bayesian statistics: likelihood, maximum likelihood, Bayes' theorem, conjugate priors, credible intervals
- Time-series analysis: autocovariance, parameter inference using time series, time-series forecasting
- Machine-learning approaches to data analysis

**Teaching**

- Per week: 2 x 2 hours of lectures, 2 x 2 hours of classwork
- Duration: 5 weeks (first half of term 1)

Classes are usually held on Mondays 10:00 - 12:00 and 13:00 - 15:00, and Thursdays 10:00 - 12:00 and 13:00 - 15:00, although this is subject to change. Classes are held in D1.07 (Complexity Seminar Room), Floor 1, Zeeman Building, unless otherwise advised.

**Assessment**

For deadlines, see the Module Resources page.

- Written homework assignments (20%)
- Written class test (40%)
- Oral examination (40%)

**Illustrative Bibliography**

G. R. Grimmett and D. R. Stirzaker, *Probability and Random Processes* (3rd edition, OUP, 2001)

J. R. Norris, *Markov Chains* (CUP, 1997)

M. J. Keeling and P. Rohani, *Modeling Infectious Diseases in Humans and Animals* (Princeton University Press, 2007)

C. M. Bishop, *Pattern Recognition and Machine Learning* (Springer, 2006)

J. D. Hamilton, *Time Series Analysis* (Princeton University Press, 1994)

G. E. P. Box, G. M. Jenkins and G. C. Reinsel, *Time Series Analysis: Forecasting and Control* (5th edition, Wiley, 2016). Available as an e-book through the Library.