- UCAS course code
- G104
- UCAS institution code
- M20
Course unit details:
Multivariate Statistics
Unit code | MATH48061 |
---|---|
Credit rating | 15 |
Unit level | Level 4 |
Teaching period(s) | Semester 1 |
Available as a free choice unit? | No |
Overview
Almost all real data – from physical, biological, and social science, as well as industry and healthcare – involves recording observations of multiple variables. This course concerns the analysis of such multivariate data, from both a theoretical and a practical viewpoint. Some techniques generalise the univariate case – for example, maximum likelihood estimation. Others are new – for example, principal component analysis.
Pre/co-requisites
Unit title | Unit code | Requirement type | Description |
---|---|---|---|
Probability 2 | MATH20701 | Pre-Requisite | Compulsory |
Statistical Methods | MATH20802 | Pre-Requisite | Compulsory |
Students are not permitted to take more than one of MATH38161 or MATH48061 for credit in the same undergraduate year. Students are not permitted to take MATH48061 and MATH68061 for credit in an undergraduate programme and then a postgraduate programme.
Aims
To provide a modern overview of multivariate statistics including both the underlying mathematical theory and practical considerations.
Learning outcomes
On successful completion of the course students will be able to:
- Work with random vectors and matrices to derive results relevant to multivariate statistical inference.
- Import multivariate data stored as plain text into statistical software, visualise the data and run the multivariate analysis techniques covered in the course on it.
- Use data or summary statistics of data to calculate sample mean vectors, variance-covariance matrices, and correlation matrices, as well as to define transformations to simplify analysis.
- Derive the principal components of data with a given covariance structure.
- Define the difference between supervised and unsupervised learning, together with an algorithm for classification of data into two classes for each case.
- Perform unbiased estimation, maximum likelihood estimation and hypothesis testing for multivariate data.
- Derive key properties of the multivariate normal distribution and apply these to the analysis of multivariate data.
- Use contingency tables to test hypotheses and estimate effect sizes for a variety of discrete multivariate models.
Syllabus
Mathematical foundations. Revision of vectors, matrices and random variables. New material on random vectors and random matrices.
Working with data. Constructing the n × p data matrix X from a data file. Sample mean vector and covariance and correlation matrices. Unbiased estimation of population mean and variance-covariance. Transformation of data including Mahalanobis, standardisation and logarithmic transformation. Visualisation of data including histograms, scatter plots, kernel density plots and plot matrices.
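As a rough illustration of the summary statistics named above, the following is a minimal sketch in plain Python. It is not the course's own software or notation: the 4 × 2 data matrix `X` and the function names are hypothetical, chosen only to show the sample mean vector, the unbiased (divisor n − 1) covariance matrix and the correlation matrix.

```python
from math import sqrt

# Hypothetical data matrix X: n = 4 observations (rows), p = 2 variables (columns).
X = [
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
    [4.0, 9.0],
]


def mean_vector(X):
    """Sample mean of each column of the n x p data matrix."""
    n, p = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(p)]


def covariance_matrix(X):
    """Unbiased sample variance-covariance matrix (divisor n - 1)."""
    n, p = len(X), len(X[0])
    xbar = mean_vector(X)
    return [
        [
            sum((row[j] - xbar[j]) * (row[k] - xbar[k]) for row in X) / (n - 1)
            for k in range(p)
        ]
        for j in range(p)
    ]


def correlation_matrix(S):
    """Rescale a covariance matrix S so that its diagonal entries are 1."""
    p = len(S)
    return [[S[j][k] / sqrt(S[j][j] * S[k][k]) for k in range(p)] for j in range(p)]


xbar = mean_vector(X)
S = covariance_matrix(X)
R = correlation_matrix(S)
```

In practice a statistical package would do this in one call; writing it out makes the n − 1 divisor behind unbiased estimation explicit.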
Parametric multivariate statistics. The multivariate normal distribution, including marginal and conditional distributions. Other parametric distributions such as the multivariate log-normal, the multivariate t, and Gaussian mixtures. Maximum likelihood estimation and confidence regions for multivariate statistical models. Hypothesis testing and model selection.
Dimensional reduction. Detailed treatment of principal components analysis as well as discussion of other methods.
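To give a flavour of principal components analysis, here is a minimal sketch for the simplest case, a 2 × 2 covariance matrix, where the eigenvalues and eigenvectors have a closed form. The function name `pca_2x2` and the example matrix are hypothetical, and this is an illustration under those assumptions rather than the method taught in the unit.

```python
from math import hypot


def pca_2x2(a, b, c):
    """Principal components of the symmetric covariance matrix [[a, b], [b, c]].

    Returns (variances, directions): the eigenvalues in decreasing order and
    the matching unit-length eigenvectors.
    """
    centre = (a + c) / 2
    radius = hypot((a - c) / 2, b)
    variances = [centre + radius, centre - radius]
    if b == 0:
        # Already diagonal: the coordinate axes are the principal directions.
        directions = [(1.0, 0.0), (0.0, 1.0)] if a >= c else [(0.0, 1.0), (1.0, 0.0)]
    else:
        directions = []
        for lam in variances:
            # (b, lam - a) solves (A - lam * I) v = 0 whenever b is non-zero.
            v = (b, lam - a)
            norm = hypot(v[0], v[1])
            directions.append((v[0] / norm, v[1] / norm))
    return variances, directions


# Hypothetical covariance structure: equal variances, positive covariance.
variances, directions = pca_2x2(2.0, 1.0, 2.0)
proportion_explained = variances[0] / sum(variances)
```

For this example the first component points along the diagonal direction (1, 1)/√2 and accounts for three quarters of the total variance.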
Classification. Supervised versus unsupervised learning. Detailed treatment of discriminant analysis and k-means clustering, as well as discussion of other methods.
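As an example of the unsupervised side, the following is a minimal sketch of k-means clustering using the standard alternating scheme (assign each point to its nearest centroid, then recompute centroids). The data, the starting centroids and all function names are hypothetical, and the unit may present the algorithm differently.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate assignment and update steps until the labels stop changing."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: label each point with its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = centroid(members)
    return labels, centroids


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centres = k_means(points, [points[0], points[3]])
```

The result depends on the starting centroids; here they are fixed by hand so that the run is reproducible.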
Discrete multivariate statistics. Discrete multivariate sampling distributions. Construction of contingency tables, and hypothesis testing for different independence and sampling null models. Effect sizes and confidence intervals.
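One common way to test independence in a contingency table is Pearson's chi-squared test, sketched below in plain Python. This is an assumption about the kind of test meant here, not a statement of the unit's exact approach; the table and function names are hypothetical, and the p-value helper is valid only for one degree of freedom (a 2 × 2 table).

```python
from math import erfc, sqrt


def chi_squared_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of independence.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


def p_value_one_df(statistic):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return erfc(sqrt(statistic / 2))


# Hypothetical 2 x 2 table of counts.
table = [[10, 20], [20, 10]]
statistic, df = chi_squared_independence(table)
p_value = p_value_one_df(statistic)
```

For larger tables the same statistic applies, but the p-value needs the chi-squared distribution with the returned `df`, which a statistical package would supply.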
Assessment methods
Method | Weight |
---|---|
Other | 20% |
Written exam | 80% |
- Coursework, which will involve applying methods to real data: weighting 20%
- End of semester examination: weighting 80%
Feedback methods
Feedback will be provided throughout the course, including:
- In tutorials you will be able to ask for and receive feedback on your work and understanding.
- You will receive feedback on your coursework.
- You can receive feedback from the lecturer in person during the office hour or at other times.
- You can receive feedback via the Forum on Blackboard.
Recommended reading
- C. Chatfield and A. Collins. Introduction to Multivariate Analysis. Chapman & Hall / CRC Texts in Statistical Science. Taylor & Francis, 1981.
An introductory book slightly below the level of the course.
- A. C. Rencher. Multivariate Statistical Inference and Applications. Wiley Series in Probability and Statistics. John Wiley & Sons, New York, 1998.
The main course text.
- Y. Bishop, S. E. Fienberg, and P. W. Holland. Discrete Multivariate Analysis: Theory and Practice. Massachusetts Institute of Technology Press, Cambridge, 1975.
Covers the discrete case.
- S. Rogers and M. Girolami. A First Course in Machine Learning. CRC Press, Boca Raton, Florida, 2nd edition, 2016.
Deals with aspects of machine learning relevant to this course.
Study hours
Scheduled activity hours | |
---|---|
Lectures | 11 |
Tutorials | 11 |
Independent study hours | |
---|---|
Independent study | 128 |
Teaching staff
Staff member | Role |
---|---|
Thomas House | Unit coordinator |
Additional notes
The independent study hours will normally comprise the following. During each week of the taught part of the semester:
· You will normally have approximately 75-120 minutes of video content. Normally you would spend approximately 2.5-4 hrs per week studying this content independently.
· You will normally have exercise or problem sheets, on which you might spend approximately 2-2.5 hrs per week.
· There may be other tasks assigned to you on Blackboard, for example short quizzes, short-answer formative exercises or directed reading.
· In some weeks you may be preparing coursework or revising for mid-semester tests.
Together with the timetabled classes, you should be spending approximately 9 hours per week on this course unit.
The remaining independent study time comprises revision for and taking the end-of-semester assessment.
The above times are indicative only and may vary depending on the week and the course unit. More information can be found on the course unit’s Blackboard page.