Statistical Learning and Big Data
Commitment:
- 30 hours of lectures + 10 hours of tutorials (via office hours)
 
Content:
- Statistical Learning
  - An introduction to statistical learning theory: from over-fitting to why apparently complex methods can nonetheless work well; for example, VC dimension, shattering sets and PAC bounds.
  - Loss functions. Risk (in the learning-theory sense); posterior expected risk. Generalisation error.
  - Supervised, unsupervised and semi-supervised learning.
  - The use of distinct training, test and validation sets, particularly in the context of prediction problems.
  - The bootstrap revisited. Bags of little bootstraps. Bootstrap aggregation. Boosting.
  - ML methods will be used to illustrate these ideas (a minimal bagging example is sketched after this list).
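As an informal illustration of the bootstrap-aggregation and train/test-set items above (not part of the syllabus), the following sketch assumes Python with numpy and scikit-learn and a synthetic data set; it compares the test-set error of a single regression tree with that of an average over trees fitted to bootstrap resamples.

    # Minimal bagging sketch on synthetic data (assumes numpy and scikit-learn).
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)

    # Synthetic regression problem: noisy sine curve.
    X = rng.uniform(-3, 3, size=(500, 1))
    y = np.sin(X[:, 0]) + 0.3 * rng.standard_normal(500)

    # Distinct training and test sets; generalisation error is estimated on the test set only.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    def bagged_predict(X_tr, y_tr, X_new, n_boot=50):
        """Average predictions of trees fitted to bootstrap resamples of the training set."""
        n = len(y_tr)
        preds = np.zeros((n_boot, len(X_new)))
        for b in range(n_boot):
            idx = rng.integers(0, n, size=n)  # bootstrap resample, drawn with replacement
            preds[b] = DecisionTreeRegressor().fit(X_tr[idx], y_tr[idx]).predict(X_new)
        return preds.mean(axis=0)

    single = DecisionTreeRegressor().fit(X_train, y_train)
    mse_single = np.mean((single.predict(X_test) - y_test) ** 2)
    mse_bagged = np.mean((bagged_predict(X_train, y_train, X_test) - y_test) ** 2)
    print(f"test MSE: single tree {mse_single:.3f}, bagged trees {mse_bagged:.3f}")

On data like these the bagged average typically achieves a lower test-set mean-squared error than the single fully grown tree, which over-fits the training noise.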
 
 
- Big-data and big-model related issues and (partial) solutions
  - The "curse of dimensionality". Multiple testing: voodoo correlations; false-discovery rate and family-wise error rate; corrections: Bonferroni, Benjamini-Hochberg (a short sketch follows this list).
  - Sparsity and regularisation. Variable selection; regression. Spike-and-slab priors. Ridge regression. The Lasso. The Dantzig selector.
  - Concentration of measure and related inferential issues.
  - MCMC in high dimensions: preconditioned Crank-Nicolson; MALA; HMC. Preconditioning. Rates of convergence.
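To make the multiple-testing item above concrete (again outside the syllabus proper), here is a short sketch, assuming Python with numpy and scipy and a hypothetical simulation of 1000 one-sided tests of which 50 carry a genuine effect; it contrasts the Bonferroni (family-wise error rate) and Benjamini-Hochberg (false-discovery rate) cut-offs.

    # Bonferroni vs Benjamini-Hochberg on simulated p-values (assumes numpy and scipy).
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    m, m_signal = 1000, 50                     # hypothetical: 1000 tests, 50 true effects
    mu = np.zeros(m)
    mu[:m_signal] = 3.0
    p = norm.sf(rng.standard_normal(m) + mu)   # one-sided p-values from z-statistics

    alpha = 0.05

    # Bonferroni: reject when p < alpha / m, controlling the family-wise error rate.
    bonferroni_reject = p < alpha / m

    # Benjamini-Hochberg: find the largest k with p_(k) <= alpha * k / m, controlling the FDR.
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    k = np.nonzero(below)[0].max() + 1 if below.any() else 0
    bh_reject = np.zeros(m, dtype=bool)
    bh_reject[order[:k]] = True               # reject the k smallest p-values

    print(f"rejections: Bonferroni {bonferroni_reject.sum()}, BH {bh_reject.sum()}")
    print(f"false discoveries under BH: {bh_reject[m_signal:].sum()}")

Bonferroni is the more conservative of the two: it typically rejects far fewer hypotheses here, while Benjamini-Hochberg recovers more of the true effects at the cost of a controlled proportion of false discoveries.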
 
 
Assessment:
- 1 x 2-hour exam
 
Illustrative Bibliography:
- Chris Bishop, Pattern recognition and machine learning, Springer, 2006.
- Peter Bühlmann and Sara van de Geer, Statistics for high-dimensional data: methods, theory and applications, Springer, 2011.
- Trevor Hastie, Robert Tibshirani and Jerome Friedman, The elements of statistical learning, Springer, 2009.
- Trevor Hastie, Robert Tibshirani and Martin Wainwright, Statistical learning with sparsity, CRC Press, 2015.
- Kevin Murphy, Machine learning: a probabilistic perspective, MIT Press, 2012.
 
Examination Period: April