MA3K1 Content


Fundamentals of statistical learning theory:

  • Regression and classification
  • Empirical risk minimization and regularization
  • VC theory

Optimization:


  • Basic algorithms (gradient descent, Newton’s method)
  • Convexity, Lagrange duality and KKT theory
  • Quadratic optimization and support vector machines
  • Subgradients and nonsmooth analysis
  • Proximal gradient methods
  • Accelerated and stochastic algorithms

Machine learning:

  • Neural networks and deep learning
  • Stochastic gradient descent
  • Kernel methods and Gaussian processes
  • Recurrent neural networks
  • Applications (pattern recognition, time series prediction)


The aim of this course is to introduce Machine Learning from the point of view of modern optimization and approximation theory.


By the end of the module the student should be able to:

  • Describe the problem of supervised learning from the point of view of function approximation, optimization, and statistics
  • Identify the most suitable optimization and modelling approach for a given machine learning problem
  • Analyse the performance of various optimization algorithms from the point of view of computational complexity (both space and time) and statistical accuracy
  • Implement a simple neural network architecture and apply it to a pattern recognition task


  1. Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Springer series in statistics, 2001.
  2. Beck, Amir. First-Order Methods in Optimization. Vol. 25. SIAM, 2017.
  3. Vapnik, Vladimir. The nature of statistical learning theory. Springer, 2013.
  4. Cucker, Felipe, and Ding Xuan Zhou. Learning theory: an approximation theory viewpoint. Vol. 24. Cambridge University Press, 2007.
  5. Higham, Catherine F., and Desmond J. Higham. Deep Learning: An Introduction for Applied Mathematicians. arXiv preprint arXiv:1801.05894, 2018.