MMath Mathematics and Statistics / Course details
Year of entry: 2021
Course unit details:
|Unit level||Level 4|
|Teaching period(s)||Semester 1|
|Offered by||Department of Mathematics|
|Available as a free choice unit?||No|
Almost all real data, whether from the physical, biological and social sciences or from industry and healthcare, involves recording observations of multiple variables. This course concerns the analysis of such multivariate data, from both a theoretical and a practical viewpoint. Some techniques generalise the univariate case, for example maximum likelihood estimation. Others are new, for example principal component analysis.
Students are not permitted to take both MATH38061 and MATH48061 for credit in the same undergraduate year. Students are also not permitted to take MATH48061 for credit in an undergraduate programme and then MATH68061 for credit in a postgraduate programme.
To provide a modern overview of multivariate statistics including both the underlying mathematical theory and practical considerations.
On successful completion of the course students will be able to:
- Work with random vectors and matrices to derive results relevant to multivariate statistical inference.
- Import multivariate data stored as plain text into statistical software, visualise the data and run the multivariate analysis techniques covered in the course on it.
- Use data or summary statistics of data to calculate sample mean vectors, variance-covariance matrices, and correlation matrices, as well as to define transformations to simplify analysis.
- Derive the principal components of data with a given covariance structure.
- Define the difference between supervised and unsupervised learning, together with an algorithm for classification of data into two classes for each case.
- Perform unbiased estimation, maximum likelihood estimation and hypothesis testing for multivariate data.
- Derive key properties of the multivariate normal distribution and apply these to the analysis of multivariate data.
- Use contingency tables to test hypotheses and estimate effect sizes for a variety of discrete multivariate models.
Mathematical foundations. Revision of vectors, matrices and random variables. New material on random vectors and random matrices.
Working with data. Constructing the n × p data matrix X from a data file. Sample mean vector and covariance and correlation matrices. Unbiased estimation of population mean and variance-covariance. Transformation of data including Mahalanobis, standardisation and logarithmic transformation. Visualisation of data including histograms, scatter plots, kernel density plots and plot matrices.
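As an informal illustration of the summary statistics named above, the following sketch computes a sample mean vector, the unbiased variance-covariance matrix, the correlation matrix and a Mahalanobis transformation. The data matrix is a made-up toy example and the use of Python with NumPy is an assumption; the course page does not say which statistical software is used.

```python
import numpy as np

# Hypothetical toy data: n = 5 observations of p = 2 variables.
X = np.array([
    [2.0, 1.0],
    [4.0, 3.0],
    [6.0, 5.0],
    [8.0, 6.0],
    [10.0, 10.0],
])
n, p = X.shape

# Sample mean vector (one entry per variable).
x_bar = X.mean(axis=0)

# Unbiased sample variance-covariance matrix (divisor n - 1).
centred = X - x_bar
S = centred.T @ centred / (n - 1)

# Sample correlation matrix: rescale S by the sample standard deviations.
sd = np.sqrt(np.diag(S))
R = S / np.outer(sd, sd)

# Mahalanobis transformation Z = (X - x_bar) S^{-1/2}, built from the
# symmetric eigendecomposition of S; Z has identity sample covariance.
eigvals, eigvecs = np.linalg.eigh(S)
S_inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
Z = centred @ S_inv_sqrt
```

Here the transformed data `Z` have zero sample mean and identity sample covariance, which is the sense in which such a transformation can simplify later analysis.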
Parametric multivariate statistics. The multivariate normal distribution, including marginal and conditional distributions. Other parametric distributions such as the multivariate log-normal, the multivariate t, and Gaussian mixtures. Maximum likelihood estimation and confidence regions for multivariate statistical models. Hypothesis testing and model selection.
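For orientation, the standard marginal and conditional results for the multivariate normal can be stated as follows. The partition notation is an assumption introduced here for illustration and may differ from that used in the lectures.

```latex
% Partition x ~ N_p(mu, Sigma) into blocks x_1 and x_2:
\[
\mathbf{x} = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix},
\qquad
\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix},
\qquad
\boldsymbol{\Sigma} = \begin{pmatrix}
  \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\
  \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22}
\end{pmatrix}.
\]
% Marginal distribution of x_1:
\[
\mathbf{x}_1 \sim N\!\left(\boldsymbol{\mu}_1,\ \boldsymbol{\Sigma}_{11}\right).
\]
% Conditional distribution of x_1 given x_2 (Sigma_22 assumed invertible):
\[
\mathbf{x}_1 \mid \mathbf{x}_2 \sim
N\!\left(
  \boldsymbol{\mu}_1 + \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}(\mathbf{x}_2 - \boldsymbol{\mu}_2),\
  \boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{21}
\right).
\]
```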
Dimensional reduction. Detailed treatment of principal components analysis as well as discussion of other methods.
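A minimal sketch of deriving principal components from a given covariance structure is shown below. The covariance matrix is hypothetical, and Python with NumPy is an assumed choice of software rather than the one prescribed by the course.

```python
import numpy as np

# Hypothetical covariance matrix for p = 2 variables (illustration only).
Sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])

# eigh is intended for symmetric matrices; eigenvalues come back ascending.
eigvals, eigvecs = np.linalg.eigh(Sigma)

# Reorder so that the first component has the largest variance.
order = np.argsort(eigvals)[::-1]
variances = eigvals[order]
loadings = eigvecs[:, order]  # columns are the principal component directions

# Proportion of the total variance explained by each component.
explained = variances / variances.sum()
```

The eigenvalues give the variances of the principal components and sum to the trace of the covariance matrix, so `explained` sums to one.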
Classification. Supervised versus unsupervised learning. Detailed treatment of discriminant analysis and k-means clustering, as well as discussion of other methods.
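As a rough sketch of the unsupervised case, the following implements k-means by Lloyd's algorithm (alternating assignment and centroid update). The function name `k_means`, its parameters and the toy data are hypothetical, and NumPy is an assumed choice of software.

```python
import numpy as np

def k_means(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate cluster assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Initialise centroids at k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centroid to its cluster mean; keep it if the cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # assignments no longer change the centroids
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy data: two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centroids = k_means(X, k=2)
```

No class labels are supplied to the algorithm, which is what distinguishes it from a supervised method such as discriminant analysis. In general the result can depend on the random initialisation.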
Discrete multivariate statistics. Discrete multivariate sampling distributions. Construction of contingency tables, and hypothesis testing for different independence and sampling null models. Effect sizes and confidence intervals.
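To make the contingency-table ideas concrete, here is a small sketch of a Pearson chi-squared test of independence on a 2 × 2 table, with the odds ratio as an effect size and a Wald confidence interval. The counts are invented for illustration, and the choice of the odds ratio and of plain Python are assumptions rather than the course's stated approach.

```python
import math

# Hypothetical 2 x 2 table of counts: rows = exposure, columns = outcome.
table = [[30, 10],
         [20, 40]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Expected counts under independence: (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic and its degrees of freedom.
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(table, expected)
           for o, e in zip(obs_row, exp_row))
dof = (len(table) - 1) * (len(table[0]) - 1)

# Valid for 1 degree of freedom only: P(chi2_1 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

# Effect size: sample odds ratio with an approximate 95% Wald interval,
# computed on the log scale and transformed back.
(a, b), (c, d) = table
odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci = (math.exp(math.log(odds_ratio) - 1.96 * se_log_or),
      math.exp(math.log(odds_ratio) + 1.96 * se_log_or))
```

For larger tables the p-value would need the chi-squared distribution with the appropriate degrees of freedom, which the standard library does not provide directly.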
- Coursework, which will involve applying methods to real data: weighting 20%
- End of semester examination: weighting 80%
Feedback will be provided throughout the course, including:
- In tutorials you will be able to ask for and receive feedback on your work and understanding.
- You will receive feedback on your coursework.
- You can receive feedback from the lecturer in person during the office hour or at other times.
- You can receive feedback via the Forum on Blackboard.
- C. Chatfield and A. Collins. Introduction to Multivariate Analysis. Chapman & Hall / CRC Texts in Statistical Science. Taylor & Francis, 1981.
An introductory book slightly below the level of the course.
- A. C. Rencher. Multivariate Statistical Inference and Applications. Wiley Series in Probability and Statistics. John Wiley & Sons, New York, 1998.
The main course text.
- Y. Bishop, S. E. Fienberg, and P. W. Holland. Discrete Multivariate Analysis: Theory and Practice. Massachusetts Institute of Technology Press, Cambridge, 1975.
Covers the discrete case.
- S. Rogers and M. Girolami. A First Course in Machine Learning. CRC Press, Boca Raton, Florida, 2nd edition, 2016.
Deals with aspects of machine learning relevant to this course.
|Scheduled activity hours|
|Independent study hours|
|Thomas House||Unit coordinator|
This course unit detail provides the framework for delivery in 20/21 and may be subject to change due to any additional Covid-19 impact.
Please see Blackboard / course unit related emails for any further updates.