The world is filling up with data: billions of images online, billions of supermarket transactions, billions of events pouring out of our everyday lives. Machine learning is about designing algorithms that automatically learn patterns from this data. Examples include online shopping sites such as Amazon.com, which learn what products you like to buy, and spam detection systems, which learn what spam looks like as you tag messages as spam.
In this course unit we will introduce you to the basics of these algorithms, implementing a basic spam filter and a handwriting recognition engine.
Pre-requisites
To enrol, students must have taken COMP11120 and COMP13212. Alternatively, students on a Computer Science and Maths programme must have taken MATH10111. 24011 is a co-requisite.
The unit aims to introduce the essential concepts behind key machine learning techniques, the methodologies for building machine learning systems, the approaches for learning from data, and the experimental methods for evaluating the performance of a learning system and getting the best performance from it. We also aim to provide the mathematical background required to understand how the methods work. The course covers the basics of both supervised and unsupervised learning paradigms and is pitched at students with a scientific or mathematical background who are interested in adaptive techniques for learning from data as well as data analysis and modelling.
Specifically, the course will cover the following. A general introduction to key concepts in machine learning and the development of the field. Essential knowledge on how to build a supervised machine learning system, including classification and regression, with respect to model architecture (e.g., instance-based models, linear models, linear basis function models, kernel methods, single- and multi-layer perceptrons), loss functions (e.g., sum-of-squares error, regularisation, cross-entropy), and optimisation approaches for training (e.g., basic optimality conditions, (stochastic) gradient descent). Basic knowledge of parametric, non-parametric, deterministic and probabilistic models. Essential knowledge of clustering analysis. Essentials of how to measure the performance of a machine learning system for classification, regression and clustering. Basics of bias and variance, over-fitting and under-fitting. Essential knowledge and practical skills for performing machine learning experiments, including data usage for model training, model selection and model testing.
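As a rough illustration of how some of the supervised learning ingredients listed above fit together (a linear model, a sum-of-squares loss with an optional regularisation term, gradient descent for training, and a held-out test set for evaluation), here is a minimal Python sketch. It is not taken from the course materials: the function names, the synthetic data and the hyperparameter values are all assumptions chosen for illustration.

```python
import random


def fit_linear_gd(xs, ys, lr=0.05, l2=0.0, epochs=2000):
    """Fit y = w*x + b by batch gradient descent.

    The loss is the mean squared error plus an L2 penalty l2 * w**2.
    All hyperparameter defaults are illustrative assumptions.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of (1/n) * sum((w*x + b - y)**2) + l2 * w**2
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n + 2 * l2 * w
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mean_squared_error(xs, ys, w, b):
    """Average squared difference between predictions and targets."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (an assumption): y = 3x + 0.5 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(200)]
ys = [3.0 * x + 0.5 + rng.gauss(0, 0.1) for x in xs]

# Hold out the last 50 points so the model is tested on unseen data.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

w, b = fit_linear_gd(train_x, train_y)
test_mse = mean_squared_error(test_x, test_y, w, b)
print(f"w={w:.3f}, b={b:.3f}, test MSE={test_mse:.4f}")
```

Evaluating on points that were not used for training is what separates model testing from model fitting. Raising `l2` above zero shrinks `w` towards zero, which is one simple way to trade a little bias for lower variance.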
Weekly lectures with structured input and exploratory activities. These will be organised as a blend of brief presentations, tutorial questions and practice activities, discussions of materials and tasks that are available online, and question-and-answer sessions.
Bi-weekly laboratories will be drop-in help desks where GTAs provide support for the problems set in the lab scripts. These will also be used as surgeries to provide feedback on assessments and as an opportunity to ask questions about the set tasks and learning materials. Lab scripts contain assessments on mathematical programming to support basic machine learning model implementation, as well as on the design, implementation and analysis of machine learning techniques for real-world applications.
Cohort-level feedback after marking, and individual feedback provided by a GTA upon request.
1. Pattern recognition and machine learning. Bishop, Christopher M. ISBN: 978-0-387-31073-2. Published by Springer. 2006. Core material.
2. Introduction to machine learning. Alpaydin, E. ISBN: 978-0-262-02818-9. Published by The MIT Press. 2014.
3. Machine learning: A probabilistic perspective. Murphy, Kevin P. ISBN: 978-0-262-01802-9. Published by The MIT Press. 2012.