Statistical Learning and Big Data
Commitment:
- 30 hours of lectures + 10 hours of tutorials (via office hours)
 
Content:
- Statistical Learning
  - An introduction to statistical learning theory: from over-fitting to why apparently complex methods can nonetheless work well; for example, VC dimension, shattering sets and PAC bounds.
  - Loss functions. Risk (in the learning-theory sense); posterior expected risk. Generalisation error.
  - Supervised, unsupervised and semi-supervised learning.
  - The use of distinct training, test and validation sets, particularly in the context of prediction problems.
  - The bootstrap revisited. Bags of little bootstraps. Bootstrap aggregation. Boosting.
  - ML methods will be used to illustrate these ideas (a minimal bagging example is sketched after this list).
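As an informal illustration of the bootstrap-aggregation and train/test-set items above (not part of the syllabus), the following sketch assumes Python with numpy and scikit-learn and a synthetic data set; it compares the test-set error of a single regression tree with that of an average over trees fitted to bootstrap resamples.

    # Minimal bagging sketch on synthetic data (assumes numpy and scikit-learn).
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)

    # Synthetic regression problem: noisy sine curve.
    X = rng.uniform(-3, 3, size=(500, 1))
    y = np.sin(X[:, 0]) + 0.3 * rng.standard_normal(500)

    # Distinct training and test sets; generalisation error is estimated on the test set only.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    def bagged_predict(X_tr, y_tr, X_new, n_boot=50):
        """Average predictions of trees fitted to bootstrap resamples of the training set."""
        n = len(y_tr)
        preds = np.zeros((n_boot, len(X_new)))
        for b in range(n_boot):
            idx = rng.integers(0, n, size=n)  # bootstrap resample, drawn with replacement
            preds[b] = DecisionTreeRegressor().fit(X_tr[idx], y_tr[idx]).predict(X_new)
        return preds.mean(axis=0)

    single = DecisionTreeRegressor().fit(X_train, y_train)
    mse_single = np.mean((single.predict(X_test) - y_test) ** 2)
    mse_bagged = np.mean((bagged_predict(X_train, y_train, X_test) - y_test) ** 2)
    print(f"test MSE: single tree {mse_single:.3f}, bagged trees {mse_bagged:.3f}")

On data like these the bagged average typically achieves a lower test-set mean-squared error than the single fully grown tree, which over-fits the training noise.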
 
 
- Big-data and big-model related issues and (partial) solutions
  - The "curse of dimensionality". Multiple testing: voodoo correlations; false-discovery rate and family-wise error rate; corrections: Bonferroni, Benjamini-Hochberg (a short sketch follows this list).
  - Sparsity and regularisation. Variable selection; regression. Spike-and-slab priors. Ridge regression. The Lasso. The Dantzig selector.
  - Concentration of measure and related inferential issues.
  - MCMC in high dimensions: preconditioned Crank-Nicolson; MALA; HMC. Preconditioning. Rates of convergence.
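To make the multiple-testing item above concrete (again outside the syllabus proper), here is a short sketch, assuming Python with numpy and scipy and a hypothetical simulation of 1000 one-sided tests of which 50 carry a genuine effect; it contrasts the Bonferroni (family-wise error rate) and Benjamini-Hochberg (false-discovery rate) cut-offs.

    # Bonferroni vs Benjamini-Hochberg on simulated p-values (assumes numpy and scipy).
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    m, m_signal = 1000, 50                     # hypothetical: 1000 tests, 50 true effects
    mu = np.zeros(m)
    mu[:m_signal] = 3.0
    p = norm.sf(rng.standard_normal(m) + mu)   # one-sided p-values from z-statistics

    alpha = 0.05

    # Bonferroni: reject when p < alpha / m, controlling the family-wise error rate.
    bonferroni_reject = p < alpha / m

    # Benjamini-Hochberg: find the largest k with p_(k) <= alpha * k / m, controlling the FDR.
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    k = np.nonzero(below)[0].max() + 1 if below.any() else 0
    bh_reject = np.zeros(m, dtype=bool)
    bh_reject[order[:k]] = True               # reject the k smallest p-values

    print(f"rejections: Bonferroni {bonferroni_reject.sum()}, BH {bh_reject.sum()}")
    print(f"false discoveries under BH: {bh_reject[m_signal:].sum()}")

Bonferroni is the more conservative of the two: it typically rejects far fewer hypotheses here, while Benjamini-Hochberg recovers more of the true effects at the cost of a controlled proportion of false discoveries.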
 
 
Assessment:
- 1 x 2-hour exam
 
Illustrative Bibliography:
- Chris Bishop, Pattern recognition and machine learning, Springer, 2006.
- Peter Bühlmann and Sara van de Geer, Statistics for high-dimensional data: methods, theory and applications, Springer, 2011.
- Trevor Hastie, Robert Tibshirani and Jerome Friedman, The elements of statistical learning, Springer, 2009.
- Trevor Hastie, Robert Tibshirani and Martin Wainwright, Statistical learning with sparsity, CRC Press, 2015.
- Kevin Murphy, Machine learning: a probabilistic perspective, MIT Press, 2012.
 
Examination Period: April