MSc Data Science (Applied Urban Analytics) / Course details

Year of entry: 2023

Course unit details:
Statistics & Machine Learning 2: AI, Complex Data, Computationally Intensive Statistics

Course unit fact file
Unit code DATA70132
Credit rating 15
Unit level FHEQ level 7 – master's degree or fourth year of an integrated master's degree
Teaching period(s) Semester 2
Offered by
Available as a free choice unit? No

Overview

The module is delivered as a mixture of lectures and practical sessions and has five main sections:

  1. Dimension reduction and feature extraction: principal components analysis, feature selection, information theory.
  2. Classifiers and clustering: supervised and unsupervised learning, k-means and k-nearest neighbours, agglomerative clustering and dendrograms, support vector machines, linear and quadratic discriminants, Gaussian process classification, model-based clustering, mixture models and the EM algorithm.
  3. Neural Networks and Deep Learning: perceptrons, back-propagation and multi-layer networks.
  4. Markov-chain Monte Carlo (MCMC) methods: Markov chains and their stationary distributions, likelihood-based inference using the Metropolis-Hastings algorithm, likelihood-free inference using Approximate Bayesian Computation, tests for convergence, applications to Bayesian inference.
  5. Special Topic: Depending on the teaching staff, one special topic will be chosen to go into near-research depth, e.g. Random Forests; Social Networks; Advanced Monte Carlo methods.

Aims

The unit aims to:

Introduce students to a selection of modern methods widely used in Data Science that can go beyond standard statistical frameworks. It builds on the foundation laid in Statistics and Machine Learning 1 and is strongly focussed on applications, aiming to train students to be informed users of existing algorithms.

Learning outcomes

Students should be able to:

  • Define the key terms from each of the module’s five sections
  • Understand when to apply a given learning algorithm and how to judge its success, including questions of convergence and computational performance
  • Construct classifiers that capture features of already-understood data and exploit them to classify new data (supervised learning)
  • Use classification algorithms to discover and exploit previously-unknown structure in data (unsupervised learning)
  • Construct and train neural networks
  • Use MCMC methods to estimate parameters and quantify uncertainty
  • Justify choices of algorithm and communicate effectively with both technical and non-technical audiences.

Teaching and learning methods

The five sections of this module are essentially self-contained subunits. Each consists of a series of lectures that introduce key concepts and support practical sessions in which students apply Python-based software tools to data analysis problems.
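As a rough illustration of the kind of exercise a practical session on the MCMC section might involve, the sketch below runs a random-walk Metropolis-Hastings sampler to estimate the mean of Gaussian data and quantify its uncertainty. It is not taken from the unit's materials: the function names (`log_posterior`, `metropolis_hastings`), the synthetic data, the flat prior, the known standard deviation, the step size and the burn-in length are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0):
    """Log posterior (up to a constant) for the mean of Gaussian data.

    Assumes a known standard deviation `sigma` and a flat prior on `mu`.
    """
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)


def metropolis_hastings(log_target, start, n_steps, step_size, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    chain = [start]
    current, current_lp = start, log_target(start)
    accepted = 0
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_target(proposal)
        # The proposal is symmetric, so the acceptance ratio reduces to the
        # ratio of target densities. 1 - random() lies in (0, 1], which
        # keeps the logarithm finite.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain.append(current)
    return chain, accepted / n_steps


# Hypothetical synthetic data: 50 draws from a Gaussian with mean 2, sd 1.
rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(50)]
sample_mean = sum(data) / len(data)

chain, acceptance_rate = metropolis_hastings(
    lambda mu: log_posterior(mu, data),
    start=0.0, n_steps=5000, step_size=0.5, rng=rng,
)

samples = chain[1000:]  # discard an (assumed) burn-in period
post_mean = sum(samples) / len(samples)
post_sd = math.sqrt(sum((s - post_mean) ** 2 for s in samples) / len(samples))

print(f"sample mean of data: {sample_mean:.3f}")
print(f"posterior mean:      {post_mean:.3f}")
print(f"posterior sd:        {post_sd:.3f}")
print(f"acceptance rate:     {acceptance_rate:.2f}")
```

For this simple model the posterior is known analytically (a Gaussian centred on the sample mean with standard deviation 1/√50), so the sampler's output can be checked against it. In realistic problems no such closed form exists, which is where convergence diagnostics become important.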

Assessment methods

Written exam
  Length: 1 hour
  Weighting: 40%
  How and when feedback is provided: Generic feedback available to the whole cohort.

Four written coursework assignments that will include computational exercises
  Length: 2 × 1000 words and 2 × 500 words, each plus figures.

Feedback methods

Feedback will be made available through Turnitin

Recommended reading

  • Simon Rogers & Mark Girolami (2017), A First Course in Machine Learning, 2nd edition, Chapman & Hall/CRC. ISBN: 9781498738484
  • Christopher Bishop (2006), Pattern Recognition and Machine Learning, Springer-Verlag, New York. ISBN: 9780387310732
  • S. Brooks, A. Gelman, G. Jones, and X.-L. Meng, eds. (2011), Handbook of Markov Chain Monte Carlo, Chapman and Hall/CRC. ISBN: 9781420079418
  • G. James, D. Witten, T. Hastie, and R. Tibshirani (2013), An Introduction to Statistical Learning with Applications in R, Springer-Verlag, New York. ISBN: 9781461471370
  • Trevor Hastie, Robert Tibshirani and Jerome Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edition, Springer-Verlag. ISBN: 9780387848587

Teaching staff

Staff member Role
Mark Muldoon Unit coordinator
