MSc Data Science (Earth and Environmental Analytics) / Course details

Year of entry: 2025

Course unit details:
Statistics & Machine Learning 2: AI, Complex Data, Computationally Intensive Statistics

Course unit fact file
Unit code DATA70132
Credit rating 15
Unit level FHEQ level 7 – master's degree or fourth year of an integrated master's degree
Teaching period(s) Semester 2
Available as a free choice unit? No

Overview

The module is delivered as a mixture of lectures and practical sessions and has five main sections:

  1. Dimension reduction and feature extraction: principal components analysis, feature selection, information theory.
  2. Classifiers and clustering: supervised and unsupervised learning, k-means and k-nearest neighbours, agglomerative clustering and dendrograms, support vector machines, linear and quadratic discriminants, Gaussian process classification, model-based clustering, mixture models and the EM algorithm.
  3. Neural networks and deep learning: perceptrons, back-propagation and multi-layer networks.
  4. Markov-chain Monte Carlo (MCMC) methods: Markov chains and their stationary distributions, likelihood-based inference using the Metropolis-Hastings algorithm, likelihood-free inference using Approximate Bayesian Computation, tests for convergence, applications to Bayesian inference.
  5. Special topic: depending on the teaching staff, one special topic will be chosen and covered in near-research depth, e.g. random forests, social networks, time series analysis or advanced Monte Carlo methods.
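
As an illustration of the kind of practical exercise that section 4 points towards, the sketch below uses a random-walk Metropolis-Hastings sampler to estimate the mean of simulated Gaussian data and to quantify its uncertainty. It is not taken from the unit's teaching material: the function name metropolis_hastings, the simulated data, the prior, the step size and the burn-in length are all assumptions chosen for illustration.

```python
import numpy as np


def metropolis_hastings(log_post, x0, n_steps, step_size, rng):
    """Random-walk Metropolis-Hastings for a scalar parameter (illustrative sketch)."""
    samples = np.empty(n_steps)
    x, lp = x0, log_post(x0)
    n_accept = 0
    for i in range(n_steps):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + step_size * rng.normal()
        lp_prop = log_post(proposal)
        # With a symmetric proposal the acceptance ratio is the posterior ratio.
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = proposal, lp_prop
            n_accept += 1
        samples[i] = x
    return samples, n_accept / n_steps


rng = np.random.default_rng(0)

# Hypothetical data set: 50 draws from a Gaussian with unknown mean, unit variance.
data = rng.normal(loc=2.0, scale=1.0, size=50)


def log_post(mu):
    # Unnormalised log-posterior: Gaussian likelihood (known unit variance)
    # plus a weak N(0, 10^2) prior on mu. Both are assumptions of this sketch.
    return -0.5 * np.sum((data - mu) ** 2) - 0.5 * (mu / 10.0) ** 2


samples, acc_rate = metropolis_hastings(log_post, x0=0.0, n_steps=5000,
                                        step_size=0.5, rng=rng)

# Discard an (assumed) burn-in period before summarising the chain.
posterior = samples[1000:]
post_mean = posterior.mean()
lo, hi = np.percentile(posterior, [2.5, 97.5])

print(f"acceptance rate: {acc_rate:.2f}")
print(f"posterior mean: {post_mean:.3f}, 95% interval: ({lo:.3f}, {hi:.3f})")
```

With a prior this weak, the posterior mean should sit close to the sample mean of the data, which gives a simple sanity check on the chain. In practice the step size and burn-in would be tuned, and convergence would be checked with the diagnostics listed under section 4.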

Pre/co-requisites

Unit title Unit code Requirement type Description
Statistics and Machine Learning 1: Statistical Foundations DATA70121 Pre-Requisite Compulsory
DATA70121 is a pre-requisite for DATA70132

Aims

The unit aims to introduce students to a selection of modern methods widely used in Data Science that can go beyond standard statistical frameworks. It builds on the foundation laid in Statistics and Machine Learning 1 and is strongly focussed on applications, aiming to train students to be informed users of existing algorithms.

Learning outcomes

Students should be able to:

  • Define the key terms from each of the module’s five sections.
  • Understand when to apply a given learning algorithm and how to judge its success, including questions of convergence and computational performance.
  • Construct classifiers that capture features of already understood data and exploit them to classify new data (supervised learning).
  • Use classification algorithms to discover and exploit previously unknown structure in data (unsupervised learning).
  • Construct and train neural networks.
  • Use MCMC methods to estimate parameters and quantify uncertainty.
  • Present results, justifying choices of algorithm and communicating effectively with both technical and non-technical audiences.

Teaching and learning methods

The five sections of this module are essentially self-contained subunits. Each consists of a series of lectures that introduce key concepts and support practical sessions in which students apply Python-based software tools to data analysis problems.

Assessment methods

Method Weight
Written exam 80%
Written assignment (inc essay) 20%

Feedback methods

Feedback will be made available through Turnitin.

Recommended reading

  • Simon Rogers & Mark Girolami (2017), A First Course in Machine Learning, 2nd edition, Chapman & Hall/CRC. ISBN: 9781498738484
  • Christopher Bishop (2006), Pattern Recognition and Machine Learning, Springer-Verlag, New York. ISBN: 9780387310732
  • S. Brooks, A. Gelman, G. Jones, and X.-L. Meng, eds. (2011), Handbook of Markov Chain Monte Carlo, Chapman and Hall/CRC. ISBN: 9781420079418
  • G. James, D. Witten, T. Hastie, and R. Tibshirani (2013), An Introduction to Statistical Learning with Applications in R. Springer-Verlag, New York. ISBN: 9781461471370
  • Trevor Hastie, Robert Tibshirani and Jerome Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd edition, Springer-Verlag. ISBN: 9780387848587

Teaching staff

Staff member Role
Lorenzo Pellis Unit coordinator
