MSc Data Science (Earth and Environmental Analytics) / Course details
Year of entry: 2025
Course unit details:
Statistics & Machine Learning 2: AI, Complex Data, Computationally Intensive Statistics
Unit code | DATA70132 |
---|---|
Credit rating | 15 |
Unit level | FHEQ level 7 – master's degree or fourth year of an integrated master's degree |
Teaching period(s) | Semester 2 |
Available as a free choice unit? | No |
Overview
The module is delivered as a mixture of lectures and practical sessions and has five main sections:
- Dimension reduction and feature extraction: principal components analysis, feature selection, information theory.
- Classifiers and clustering: supervised and unsupervised learning, k-means and k-nearest neighbours, agglomerative clustering and dendrograms, support vector machines, linear and quadratic discriminants, Gaussian process classification, model-based clustering, mixture models and the EM algorithm.
- Neural networks and deep learning: perceptrons, back-propagation and multi-layer networks.
- Markov-chain Monte Carlo (MCMC) methods: Markov chains and their stationary distributions, likelihood-based inference using the Metropolis-Hastings algorithm, likelihood-free inference using Approximate Bayesian Computation, tests for convergence, applications to Bayesian inference.
- Special topic: Depending on the teaching staff, one special topic will be chosen to go into near-research depth, e.g. Random Forests; Social Networks; Time Series Analysis; Advanced Monte Carlo methods.
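
To illustrate the Metropolis-Hastings item in the list above, here is a minimal sketch of a random-walk sampler in plain Python. It is not taken from the unit's teaching material: the function name `metropolis_hastings`, the toy data, the flat prior, the known unit variance and the step size are all assumptions chosen for illustration.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_target returns the log of the (unnormalised) target density.
    The Gaussian proposal is symmetric, so the Hastings correction cancels.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        # 1 - random() lies in (0, 1], which keeps log() well defined.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


# Hypothetical toy data: observations assumed Normal(mu, 1) with a flat prior on mu.
data = [1.2, 0.8, 1.1, 0.9, 1.0]


def log_posterior(mu):
    # Log-likelihood up to an additive constant; the flat prior adds nothing.
    return -0.5 * sum((y - mu) ** 2 for y in data)


chain = metropolis_hastings(log_posterior, x0=0.0, n_samples=20000, step=0.8)
kept = chain[2000:]  # discard an assumed burn-in period

post_mean = sum(kept) / len(kept)
post_sd = math.sqrt(sum((s - post_mean) ** 2 for s in kept) / len(kept))
print(f"posterior mean ~ {post_mean:.2f}, posterior sd ~ {post_sd:.2f}")
```

Under these assumptions the exact posterior for the mean is Normal with mean 1.0 and variance 1/5, so the sampled mean and standard deviation can be compared against known values as a basic sanity check.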
Pre/co-requisites
Unit title | Unit code | Requirement type | Description |
---|---|---|---|
Statistics and Machine Learning 1: Statistical Foundations | DATA70121 | Pre-Requisite | Compulsory |
Aims
The unit aims to introduce students to a selection of modern methods widely used in Data Science that can go beyond standard statistical frameworks. It builds on the foundation laid in Statistics and Machine Learning 1 and is strongly focussed on applications, aiming to train students to be informed users of existing algorithms.
Learning outcomes
Students should be able to:
- Define the key terms from each of the module’s five sections.
- Understand when to apply a given learning algorithm and how to judge its success, including questions of convergence and computational performance.
- Construct classifiers that capture features of already-understood data and exploit them to classify new data (supervised learning).
- Use classification algorithms to discover and exploit previously-unknown structure in data (unsupervised learning).
- Construct and train neural networks.
- Use MCMC methods to estimate parameters and quantify uncertainty.
- Present results, justifying choices of algorithm and communicating effectively with both technical and non-technical audiences.
Teaching and learning methods
The five sections of this module are essentially self-contained subunits. Each consists of a series of lectures that introduce key concepts and support practical sessions in which students apply Python-based software tools to data analysis problems.
Assessment methods
Method | Weight |
---|---|
Written exam | 80% |
Written assignment (inc essay) | 20% |
Feedback methods
Feedback will be made available through Turnitin.
Recommended reading
- Simon Rogers & Mark Girolami (2017), A First Course in Machine Learning, 2nd edition, Chapman & Hall/CRC. ISBN: 9781498738484
- Christopher Bishop (2006), Pattern Recognition and Machine Learning, Springer-Verlag, New York. ISBN: 9780387310732
- S. Brooks, A. Gelman, G. Jones, and X.-L. Meng, eds. (2011), Handbook of Markov Chain Monte Carlo, Chapman and Hall/CRC. ISBN: 9781420079418
- G. James, D. Witten, T. Hastie, and R. Tibshirani (2013), An Introduction to Statistical Learning with Applications in R, Springer-Verlag, New York. ISBN: 9781461471370
- Trevor Hastie, Robert Tibshirani and Jerome Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd edition, Springer-Verlag. ISBN: 9780387848587
Teaching staff
Staff member | Role |
---|---|
Lorenzo Pellis | Unit coordinator |