MMath Mathematics / Course details

Year of entry: 2024

Course unit details:
Multivariate Statistics

Course unit fact file
Unit code MATH48061
Credit rating 15
Unit level Level 4
Teaching period(s) Semester 1
Available as a free choice unit? No

Overview

Almost all real data – from physical, biological, and social science, as well as industry and healthcare – involves recording observations of multiple variables. This course concerns the analysis of such multivariate data, from both a theoretical and a practical viewpoint. Some techniques generalise the univariate case – for example, maximum likelihood estimation. Others are new – for example, principal component analysis.

Pre/co-requisites

Unit title            Unit code   Requirement type   Description
Probability 2         MATH20701   Pre-Requisite      Compulsory
Statistical Methods   MATH20802   Pre-Requisite      Compulsory
MATH48061 pre-requisites

Students are not permitted to take both MATH38161 and MATH48061 for credit in the same undergraduate year. Students are also not permitted to take MATH48061 for credit in an undergraduate programme and then MATH68061 for credit in a postgraduate programme.

Aims

To provide a modern overview of multivariate statistics including both the underlying mathematical theory and practical considerations.

Learning outcomes

On successful completion of the course students will be able to:

  • Work with random vectors and matrices to derive results relevant to multivariate statistical inference.
  • Import multivariate data stored as plain text into statistical software, visualise the data and run the multivariate analysis techniques covered in the course on it.
  • Use data or summary statistics of data to calculate sample mean vectors, variance-covariance matrices, and correlation matrices, as well as to define transformations to simplify analysis.
  • Derive the principal components of data with a given covariance structure.
  • Define the difference between supervised and unsupervised learning, together with an algorithm for classification of data into two classes for each case.
  • Perform unbiased estimation, maximum likelihood estimation and hypothesis testing for multivariate data.
  • Derive key properties of the multivariate normal distribution and apply these to the analysis of multivariate data.
  • Use contingency tables to test hypotheses and estimate effect sizes for a variety of discrete multivariate models.

Syllabus

Mathematical foundations. Revision of vectors, matrices and random variables. New material on random vectors and random matrices.

Working with data. Constructing the n × p data matrix X from a data file. Sample mean vector and covariance and correlation matrices. Unbiased estimation of population mean and variance-covariance. Transformation of data including Mahalanobis, standardisation and logarithmic transformation. Visualisation of data including histograms, scatter plots, kernel density plots and plot matrices.
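
As a minimal sketch of the summary statistics named above, the following plain-Python example computes the sample mean vector, the unbiased (divisor n − 1) variance-covariance matrix and the correlation matrix. The data matrix and all function names are made up for illustration; statistical software would normally do this in a single call.

```python
# Made-up data matrix X: n = 4 observations (rows), p = 2 variables (columns).
X = [
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 5.0],
    [4.0, 9.0],
]

def sample_mean(X):
    """Column-wise mean vector of the data matrix."""
    n, p = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(p)]

def sample_covariance(X):
    """Unbiased sample variance-covariance matrix (divisor n - 1)."""
    n, p = len(X), len(X[0])
    m = sample_mean(X)
    return [
        [sum((row[j] - m[j]) * (row[k] - m[k]) for row in X) / (n - 1)
         for k in range(p)]
        for j in range(p)
    ]

def correlation(S):
    """Correlation matrix obtained by rescaling a covariance matrix S."""
    p = len(S)
    return [[S[j][k] / (S[j][j] * S[k][k]) ** 0.5 for k in range(p)]
            for j in range(p)]

xbar = sample_mean(X)      # mean vector
S = sample_covariance(X)   # variance-covariance matrix
R = correlation(S)         # correlation matrix
```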

Parametric multivariate statistics. The multivariate normal distribution, including marginal and conditional distributions. Other parametric distributions such as the multivariate log-normal, the multivariate t, and Gaussian mixtures. Maximum likelihood estimation and confidence regions for multivariate statistical models. Hypothesis testing and model selection.
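
For orientation, the standard marginal and conditional results for a partitioned multivariate normal vector can be stated as follows. The partition notation is an assumption made here for illustration, and the covariance block Σ22 is assumed to be invertible.

```latex
\[
\mathbf{x} = \begin{pmatrix}\mathbf{x}_1\\ \mathbf{x}_2\end{pmatrix}
\sim N_p\!\left(
\begin{pmatrix}\boldsymbol{\mu}_1\\ \boldsymbol{\mu}_2\end{pmatrix},
\begin{pmatrix}\Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22}\end{pmatrix}
\right)
\]
\[
\mathbf{x}_1 \sim N\!\left(\boldsymbol{\mu}_1,\ \Sigma_{11}\right),
\qquad
\mathbf{x}_1 \mid \mathbf{x}_2 \sim
N\!\left(\boldsymbol{\mu}_1 + \Sigma_{12}\Sigma_{22}^{-1}(\mathbf{x}_2 - \boldsymbol{\mu}_2),\
\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right)
\]
```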

Dimensional reduction. Detailed treatment of principal components analysis as well as discussion of other methods.
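
A minimal sketch of deriving principal components from a given covariance structure, restricted to two variables so that the eigen-decomposition has a closed form. The covariance matrix and the function name are made up for illustration; general p would need a numerical eigen-solver.

```python
import math

def principal_components_2x2(S):
    """Eigenvalues (largest first) and unit eigenvectors of a symmetric 2x2 matrix."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    centre = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    eigenvalues = [centre + radius, centre - radius]
    if b == 0:
        # Already diagonal: the coordinate axes are the principal directions.
        components = [(1.0, 0.0), (0.0, 1.0)] if a >= d else [(0.0, 1.0), (1.0, 0.0)]
    else:
        components = []
        for lam in eigenvalues:
            v = (b, lam - a)  # solves (S - lam * I) v = 0 when b != 0
            norm = math.hypot(*v)
            components.append((v[0] / norm, v[1] / norm))
    return eigenvalues, components

Sigma = [[2.0, 1.0], [1.0, 2.0]]  # made-up covariance matrix
eigenvalues, components = principal_components_2x2(Sigma)
explained = eigenvalues[0] / sum(eigenvalues)  # share of total variance on the first component
```

Here the first principal component points along (1, 1)/√2 and carries three quarters of the total variance.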

Classification. Supervised versus unsupervised learning. Detailed treatment of discriminant analysis and k-means clustering, as well as discussion of other methods.
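
To illustrate the unsupervised case, here is a minimal sketch of k-means clustering (Lloyd's algorithm) in plain Python. The data points, the function name and the deterministic initialisation from the first k points are assumptions made for illustration; practical implementations usually use random or smarter initialisation.

```python
def kmeans(points, k, iterations=100):
    """Alternate nearest-centre assignment and centre updates until labels stop changing."""
    centres = [list(p) for p in points[:k]]  # assumption: start from the first k points
    labels = None
    for _ in range(iterations):
        # Assignment step: label each point with its nearest centre (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centres[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centre to the mean of its assigned points.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres

# Made-up data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centres = kmeans(points, k=2)
```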

Discrete multivariate statistics. Discrete multivariate sampling distributions. Construction of contingency tables, and hypothesis testing for different independence and sampling null models. Effect sizes and confidence intervals.
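
As a minimal sketch of hypothesis testing on a contingency table, the example below computes Pearson's chi-square statistic for independence and, for a 2 × 2 table, the p-value and the odds ratio as an effect size. The counts and function names are made up for illustration, and the p-value helper is valid only for one degree of freedom.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # expected count under independence
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def p_value_one_dof(stat):
    """Upper tail of chi-square with 1 degree of freedom: P(Z^2 > stat) for standard normal Z."""
    return math.erfc(math.sqrt(stat / 2))

table = [[20, 10], [10, 20]]  # made-up 2 x 2 counts
stat, dof = chi_square_independence(table)
p = p_value_one_dof(stat)
odds_ratio = (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])
```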

Assessment methods

Method Weight
Other 20%
Written exam 80%
  • Coursework, which will involve applying methods to real data: weighting 20%
  • End of semester examination: weighting 80%

Feedback methods

Feedback will be provided throughout the course, including:

  • In tutorials you will be able to ask for and receive feedback on your work and understanding.
  • You will receive feedback on your coursework.
  • You can receive feedback from the lecturer in person during the office hour or at other times.
  • You can receive feedback via the Forum on Blackboard.

Recommended reading

  • C. Chatfield and A. Collins. Introduction to Multivariate Analysis. Chapman & Hall / CRC Texts in Statistical Science. Taylor & Francis, 1981.

An introductory book slightly below the level of the course.

  • A. C. Rencher. Multivariate Statistical Inference and Applications. Wiley Series in Probability and Statistics. John Wiley & Sons, New York, 1998.

The main course text.

  • Y. Bishop, S. E. Fienberg, and P. W. Holland. Discrete Multivariate Analysis: Theory and Practice. Massachusetts Institute of Technology Press, Cambridge, 1975.

Covers the discrete case.

  • S. Rogers and M. Girolami. A First Course in Machine Learning. CRC Press, Boca Raton, Florida, 2nd edition, 2016.

Deals with aspects of machine learning relevant to this course.

Study hours

Scheduled activity hours
Lectures 11
Tutorials 11
Independent study hours
Independent study 128

Teaching staff

Staff member Role
Thomas House Unit coordinator

Additional notes

The independent study hours will normally comprise the following. During each week of the taught part of the semester:

  • You will normally have approximately 75-120 minutes of video content. Normally you would spend approximately 2.5-4 hrs per week studying this content independently.
  • You will normally have exercise or problem sheets, on which you might spend approximately 2-2.5 hrs per week.
  • There may be other tasks assigned to you on Blackboard, for example short quizzes, short-answer formative exercises or directed reading.
  • In some weeks you may be preparing coursework or revising for mid-semester tests.

Together with the timetabled classes, you should be spending approximately 9 hours per week on this course unit.

The remaining independent study time comprises revision for and taking the end-of-semester assessment.

The above times are indicative only and may vary depending on the week and the course unit. More information can be found on the course unit’s Blackboard page.

 
