- UCAS course code: GG41
- UCAS institution code: M20
BSc Computer Science and Mathematics with Industrial Experience
Year of entry: 2024
Course unit details:
Mathematical Topics in Machine Learning
Unit code | COMP34312 |
---|---|
Credit rating | 10 |
Unit level | Level 3 |
Teaching period(s) | Semester 2 |
Available as a free choice unit? | Yes |
Overview
Topic 1: Empirical risk minimisation, regularisation; bias/variance theory and its relation to overfitting; the probabilistic view: likelihood vs. loss, introducing exponential families.
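As a worked sketch of this topic (our illustration; the notation is not taken from the unit page), the regularised empirical-risk objective and its likelihood reading under a Gaussian noise assumption can be written as:

```latex
% Regularised empirical risk minimisation (illustrative sketch):
% average loss over n training pairs, plus a regulariser weighted by lambda.
\hat{f} = \arg\min_{f} \; \frac{1}{n}\sum_{i=1}^{n} \ell\big(f(x_i), y_i\big) + \lambda\,\Omega(f)
% The probabilistic view: under a Gaussian noise model, the negative
% log-likelihood recovers the squared loss (up to constants), so maximum
% likelihood coincides with empirical risk minimisation.
-\log p(y \mid x, f) = \frac{\big(y - f(x)\big)^2}{2\sigma^2} + \mathrm{const}
```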
Topic 2: Information theory: KL-divergence vs. cross-entropy, mutual information; the view of ML as compression.
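A compact illustration of the relationships mentioned above (our sketch, using standard notation):

```latex
% KL divergence decomposes into cross-entropy minus entropy, so minimising
% cross-entropy in q is equivalent to minimising KL when p is fixed.
D_{\mathrm{KL}}(p \,\|\, q) = \sum_{x} p(x)\log\frac{p(x)}{q(x)} = H(p, q) - H(p)
% Mutual information is the KL divergence between the joint distribution
% and the product of the marginals.
I(X; Y) = D_{\mathrm{KL}}\big(p(x, y) \,\|\, p(x)\,p(y)\big)
```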
Topic 3: Optimisation theory (calculus). Why gradient descent? What are convex and non-convex functions? How do gradients inform how we optimise a function? How can we use second-order properties? How can we prove whether a method will converge?
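For concreteness, a standard illustration (ours, not the unit's): the gradient descent update rule, and a classical convergence guarantee for convex, L-smooth functions:

```latex
% Gradient descent with step size eta:
x_{k+1} = x_k - \eta\,\nabla f(x_k)
% If f is convex and L-smooth and eta = 1/L, the suboptimality after k steps
% decays at rate O(1/k):
f(x_k) - f(x^\star) \le \frac{L\,\|x_0 - x^\star\|^2}{2k}
```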
Topic 4: Dimensionality reduction (matrix algebra): “refine, denoise, and visualise your data”. Data usually has only a few degrees of freedom of interest, living on a low-dimensional manifold within a high-dimensional space. This topic introduces matrix-algebra-intensive methods for learning feature dimensions that can aid the model-fitting process. Examples include PCA, spectral embedding, and Fisher discriminant analysis. These allow visualisation, denoising, and enhanced separation of data for classification.
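As a minimal sketch of one such method (our illustration; the unit itself requires no coding), PCA can be computed from the singular value decomposition of the centred data matrix:

```python
# Minimal PCA sketch via SVD (illustrative only; not part of the unit's materials).
import numpy as np

def pca(X: np.ndarray, k: int) -> np.ndarray:
    """Project n samples (rows of X) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                 # centre each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                     # top-k right singular vectors
    return Xc @ components.T                # low-dimensional embedding, shape (n, k)

# Example: reduce 100 points in 5-D to 2-D.
rng = np.random.default_rng(0)
Z = pca(rng.normal(size=(100, 5)), k=2)
print(Z.shape)  # (100, 2)
```

Working from the SVD of the centred data avoids explicitly forming the covariance matrix, which is the more numerically stable route.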
Pre/co-requisites
Unit title | Unit code | Requirement type | Description |
---|---|---|---|
Machine Learning | COMP24112 | Pre-Requisite | Compulsory |
To enrol, students are required to have taken COMP24112.
Aims
Machine Learning has certain mathematical “building blocks”, which turn up in the study of all types of models and algorithms. Specifically, these building blocks draw on techniques from probability theory, matrix algebra, and calculus.
This module aims to introduce students to these building blocks, and then show them how to: (1) read and correctly interpret research papers in this context; and (2) understand how novel algorithms are devised in modern ML.
There will be no required coding or practical algorithm development. The module aims to be a stepping-stone toward research, whether in industry or in a PhD.
Learning outcomes
- Discuss key mathematical terms in ML, e.g. bias/variance, entropy/cross-entropy, regularisation, and the duality between the probabilistic and loss-function views of ML, together with their consequences in practical scenarios.
- Correctly manipulate and interpret mathematical expressions for the likelihood of models, entropies and mutual information between random variables.
- Explain taught linear algebra concepts and methods, e.g., vector space/subspace, basis, linear independence, rank, inverse, orthogonality, singular value decomposition, eigen-decomposition.
- Explain and compare the nature and advantages/disadvantages of dimensionality reduction methods, e.g., PCA, spectral embedding, FDA, and how they make use of linear algebra concepts.
- Discuss and interpret concepts in convex and non-convex optimisation, including convergence properties and the proof techniques used to analyse stochastic gradient descent.
Syllabus
The syllabus comprises the four topics described in the Overview above.
Teaching and learning methods
The unit consists of four major topics, delivered in two-week blocks before Easter. Each topic comprises videos to watch and readings to cover before the interactive sessions, which serve to reinforce them. Weekly in-class MCQs will be used as formative and summative assessment.
After Easter, a series of carefully selected classic research papers will be read week by week in groups, introducing students to how to read and interpret research results. Presentations of the papers will consolidate depth of understanding.
Assessment methods
Method | Weight |
---|---|
Written exam | 100% |
Feedback methods
Correct answers are discussed the following week.
Recommended reading
Selected chapters: Machine Learning: A Probabilistic Perspective, by Kevin Murphy
Selected chapters: Introduction to Probability for Data Science, by Stanley H. Chan
Study hours
Scheduled activity hours | |
---|---|
Assessment (written exam) | 2 |
Lectures | 11 |
Practical classes & workshops | 11 |
Independent study hours | |
---|---|
Independent study | 76 |
Teaching staff
Staff member | Role |
---|---|
Gavin Brown | Unit coordinator |