MSc Data Science (Applied Urban Analytics)

Year of entry: 2023

Course unit details:
Statistics & Machine Learning 2: AI, Complex Data, Computationally Intensive Statistics

Course unit fact file
Unit code DATA70132
Credit rating 15
Unit level FHEQ level 7 – master's degree or fourth year of an integrated master's degree
Teaching period(s) Semester 2
Offered by
Available as a free choice unit? No

Overview

The module is delivered as a mixture of lectures and practical sessions and has five main sections:

  1. Dimension reduction and feature extraction: principal components analysis, feature selection, information theory.
  2. Classifiers and clustering: supervised and unsupervised learning, k-means and k-nearest neighbours, agglomerative clustering and dendrograms, support vector machines, linear and quadratic discriminants, Gaussian process classification, model-based clustering, mixture models and the EM algorithm.
  3. Neural Networks and Deep Learning: perceptrons, back-propagation and multi-layer networks.
  4. Markov-chain Monte Carlo (MCMC) methods: Markov chains and their stationary distributions, likelihood-based inference using the Metropolis-Hastings algorithm, likelihood-free inference using Approximate Bayesian Computation, tests for convergence, applications to Bayesian inference.
  5. Special Topic: Depending on the teaching staff, one special topic will be chosen to go into near-research depth, e.g. Random Forests; Social Networks; Advanced Monte Carlo methods.
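
As an illustration of the kind of method covered in section 4, the following is a minimal sketch of a random-walk Metropolis-Hastings sampler. It is not taken from the unit's teaching materials. It assumes a toy problem chosen purely for illustration: estimating the mean of normally distributed data with known standard deviation and a flat prior. The function names, step size, burn-in length and synthetic data are all hypothetical.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0):
    """Log-posterior (up to a constant) for the mean of a normal model.

    Assumes known sigma and a flat prior on mu (illustrative choices).
    """
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)


def metropolis_hastings(data, n_steps=5000, step_size=0.5, seed=0):
    """Random-walk Metropolis sampler targeting the posterior of mu."""
    rng = random.Random(seed)
    mu = 0.0  # arbitrary starting point
    current_lp = log_posterior(mu, data)
    samples = []
    for _ in range(n_steps):
        # Symmetric Gaussian proposal, so the Hastings correction cancels.
        proposal = mu + rng.gauss(0.0, step_size)
        proposal_lp = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio), on the log scale.
        # 1 - random() lies in (0, 1], which keeps log() well defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            mu, current_lp = proposal, proposal_lp
        samples.append(mu)
    return samples


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [data_rng.gauss(2.0, 1.0) for _ in range(50)]  # synthetic data
    chain = metropolis_hastings(data)[1000:]  # discard burn-in (assumed length)
    print("posterior mean estimate:", sum(chain) / len(chain))
    print("sample mean of data:   ", sum(data) / len(data))
```

With a flat prior the posterior mean coincides with the sample mean, so comparing the two printed values gives a rough check that the chain has settled on its stationary distribution.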

Aims

The unit aims to:

Introduce students to a selection of modern methods widely used in Data Science that can go beyond standard statistical frameworks. It builds on the foundation laid in Statistics and Machine Learning 1 and is strongly focussed on applications, aiming to train students to be informed users of existing algorithms.

Learning outcomes

Students should be able to:

  • Define the key terms from each of the module’s five sections
  • Understand when to apply a given learning algorithm and how to judge its success, including questions of convergence and computational performance
  • Construct classifiers that capture features of already-understood data and exploit them to classify new data (supervised learning)
  • Use classification algorithms to discover and exploit previously-unknown structure in data (unsupervised learning)
  • Construct and train neural networks
  • Use MCMC methods to estimate parameters and quantify uncertainty
  • Justify choices of algorithm and communicate effectively with both technical and non-technical audiences.
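
To make the distinction between the supervised and unsupervised outcomes concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the unit's own code: `knn_predict` and `k_means` are hypothetical helper names, the data would be supplied by the user, and k-means is started naively from the first k points for brevity (random or k-means++ starts are more usual in practice).

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Supervised: label a new point by majority vote among its
    k nearest labelled neighbours (Euclidean distance)."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def k_means(points, k=2, n_iter=20):
    """Unsupervised: Lloyd's algorithm on unlabelled points.

    Returns (centroids, assignments). Initialisation uses the first
    k points, a simplifying assumption made for this sketch.
    """
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return centroids, assignments
```

The k-nearest-neighbours function needs labels for its training points, whereas k-means receives only the points themselves and has to discover the grouping.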

Teaching and learning methods

The five sections of this module are essentially self-contained subunits. Each consists of a series of lectures that introduce key concepts and support practical sessions in which students apply Python-based software tools to data analysis problems.

Assessment methods

Assessment task: Written exam
Length: 1 hour
How and when feedback is provided: Generic feedback available to the whole cohort.
Weighting: 40%

Assessment task: Four written coursework assignments that will include computational exercises
Length: 2 × 1000 words and 2 × 500 words, each plus figures

Feedback methods

Feedback will be made available through Turnitin

Recommended reading

  • Simon Rogers and Mark Girolami (2017), A First Course in Machine Learning, 2nd edition, Chapman & Hall/CRC. ISBN: 9781498738484
  • Christopher Bishop (2006), Pattern Recognition and Machine Learning, Springer-Verlag, New York. ISBN: 9780387310732
  • S. Brooks, A. Gelman, G. Jones, and X.-L. Meng, eds. (2011), Handbook of Markov Chain Monte Carlo, Chapman & Hall/CRC. ISBN: 9781420079418
  • G. James, D. Witten, T. Hastie, and R. Tibshirani (2013), An Introduction to Statistical Learning with Applications in R, Springer-Verlag, New York. ISBN: 9781461471370
  • Trevor Hastie, Robert Tibshirani and Jerome Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edition, Springer-Verlag. ISBN: 9780387848587

Teaching staff

Staff member Role
Mark Muldoon Unit coordinator
