- UCAS course code: GG41
- UCAS institution code: M20
BSc Computer Science and Mathematics with Industrial Experience
Year of entry: 2024
Course unit details:
Generalised Linear Models
Unit code | MATH38171 |
---|---|
Credit rating | 20 |
Unit level | Level 3 |
Teaching period(s) | Semester 1 |
Available as a free choice unit? | No |
Overview
Linear models are an important modelling strategy for investigating whether, and how, one or more explanatory variables (such as age, sex or blood pressure) influence a response variable (such as a patient's diagnosis), while taking random variation in the data into account. In a linear model, linear regression and the normal distribution are used to explore a possible linear relationship between a continuous response and one or more explanatory variables. This course unit relaxes the two strict limitations of linear models, linearity and normality: linearity is extended to non-linearity, and normality is extended to a commonly encountered family of distributions called the exponential family. This extension gives Generalized Linear Models (GLMs). On the one hand, GLMs unify linear and non-linear models within a single statistical modelling framework. On the other hand, they can be used to analyze discrete data, including binary, binomial, count and categorical data, which arise very often in biomedical and industrial applications.
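As a brief illustration of the extension described above, a GLM is commonly written in three parts. The notation below ($\theta$, $\phi$, $a$, $b$, $c$, $g$, $\beta$) is the conventional textbook notation and is an assumption here; it is not taken from the unit's own materials.

```latex
% Response distribution: exponential (dispersion) family
f(y;\theta,\phi) = \exp\!\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\},
\qquad
\operatorname{E}(Y) = \mu = b'(\theta),
\qquad
\operatorname{Var}(Y) = a(\phi)\, b''(\theta).

% Linear predictor and link function
\eta = x^{\top}\beta,
\qquad
g(\mu) = \eta .
```

Under this notation the ordinary linear model is the special case of a normal response with the identity link $g(\mu) = \mu$, while logistic and Poisson regression use the logit and log links respectively.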
Pre/co-requisites
Unit title | Unit code | Requirement type | Description |
---|---|---|---|
Linear Algebra | MATH11022 | Pre-Requisite | Compulsory |
Probability and Statistics 2 | MATH27720 | Pre-Requisite | Compulsory |
Linear Regression Models | MATH27711 | Pre-Requisite | Compulsory |
Pre-requisite: either MATH10202 (Linear Algebra A) or MATH10212 (Linear Algebra B).
Aims
- To introduce the theory and application of generalized linear models, including parameter estimation via numerical maximization of the likelihood function, confidence intervals, hypothesis testing, model selection, model diagnostics, and use of R.
Learning outcomes
After successful completion of the course, students will be able to:
- formulate appropriate generalized linear models and statistical hypotheses to investigate real-world questions involving non-normal response variables;
- apply generalized linear model techniques to make inferences and predictions about the relationship between covariates and a non-normal response variable, and to shed light on real-world questions;
- check whether the assumptions underpinning such analyses are justified;
- explain and interpret the resulting models and conclusions, with reference to the original real-world questions where appropriate, and also explain key underpinning ideas, assumptions, procedures, and theoretical results;
- prove underpinning mathematical and theoretical results;
- use R to apply the methods and to conduct appropriate simulation studies to evaluate the performance of the methods studied.
Syllabus
Introduction [6]
Motivation for GLMs: binary and count data, brief introduction to logistic and Poisson regression.
Review of linear models, likelihood theory and R programming. Categorical factors and interactions.
Generalized linear models and their estimation [5]
Exponential dispersion family. General form of a GLM including response distribution, link function and linear predictor. Iterative numerical methods for maximum likelihood estimation of the regression coefficients. Comparison of methods for estimation of the dispersion parameter.
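The iterative estimation mentioned above can be sketched with iteratively reweighted least squares (IRLS) for a logistic regression with one covariate. This is a minimal sketch only: the unit itself uses R, the function name `fit_logistic_irls` and the small data set are hypothetical, and Python's standard library is used purely for illustration.

```python
import math


def fit_logistic_irls(x, y, n_iter=25, tol=1e-10):
    """Fit logit(P(y = 1)) = b0 + b1 * x by iteratively reweighted least squares."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # Accumulate the 2x2 weighted normal equations (X'WX) b = X'Wz.
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi                 # linear predictor
            mu = 1.0 / (1.0 + math.exp(-eta))  # inverse logit link
            w = mu * (1.0 - mu)                # IRLS weight for the logit link
            z = eta + (yi - mu) / w            # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 system in closed form.
        det = s_w * s_wxx - s_wx ** 2
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Hypothetical binary responses (not perfectly separated by x).
x_data = [0, 1, 2, 3, 4, 5, 6, 7]
y_data = [0, 0, 1, 0, 1, 0, 1, 1]

b0_hat, b1_hat = fit_logistic_irls(x_data, y_data)
print("intercept:", b0_hat, "slope:", b1_hat)
```

Each pass solves a weighted least-squares problem whose weights and working response depend on the current fit. For the canonical logit link this coincides with Newton's method on the log-likelihood. The sketch makes no attempt to handle separated data, where the estimates diverge.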
Inference, model selection and diagnostics [6]
Wald tests and confidence intervals. Likelihood ratio tests, score tests, and analysis of deviance. Goodness-of-fit tests. AIC and BIC. Stepwise selection. Pearson, deviance and quantile residuals; diagnostic plots.
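Several of the quantities listed above can be computed by hand for a simple Poisson model. The sketch below compares a common-mean model with a two-group model, for which the maximum likelihood fitted values are just sample means. The counts and all function names are hypothetical, and Python is used for illustration only (the unit itself uses R).

```python
import math


def poisson_loglik(y, mu):
    """Poisson log-likelihood at fitted means mu."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1) for yi, mi in zip(y, mu))


def poisson_deviance(y, mu):
    """Residual deviance 2 * sum[y * log(y / mu) - (y - mu)], taking 0 * log(0) as 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += term - (yi - mi)
    return 2.0 * total


def aic(loglik, n_params):
    """Akaike information criterion."""
    return -2.0 * loglik + 2.0 * n_params


# Hypothetical counts observed in two groups.
group_a = [2, 3, 1, 4, 2]
group_b = [6, 5, 8, 7, 9]
y = group_a + group_b

# Null model: one common mean; the MLE is the overall sample mean.
mu_null = [sum(y) / len(y)] * len(y)

# Group model: one mean per group; the MLEs are the group sample means.
mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)
mu_group = [mean_a] * len(group_a) + [mean_b] * len(group_b)

# Analysis of deviance: the drop in deviance is the likelihood ratio statistic.
dev_null = poisson_deviance(y, mu_null)
dev_group = poisson_deviance(y, mu_group)
lr_stat = dev_null - dev_group

# Upper tail of the chi-squared distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

aic_null = aic(poisson_loglik(y, mu_null), 1)
aic_group = aic(poisson_loglik(y, mu_group), 2)

# Pearson residuals for the group model.
pearson = [(yi - mi) / math.sqrt(mi) for yi, mi in zip(y, mu_group)]

print("LR statistic:", lr_stat, "p-value:", p_value)
print("AIC (null, group):", aic_null, aic_group)
```

The chi-squared reference distribution for the likelihood ratio statistic is an asymptotic approximation, so with samples this small the p-value should be read as indicative only.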
Applications and extensions [5]
Binary and binomial data: particular issues, separation, goodness-of-fit. Contingency tables, log-linear models. Gamma GLMs. Insurance claims. Credit scoring. Time permitting, a brief introduction to a more advanced extension.
Assessment methods
Method | Weight |
---|---|
Other | 20% |
Written exam | 80% |
- Coursework: 20%
- End of semester examination: 80%
Feedback methods
- Tutorials, labs and office hours give the opportunity to receive individual feedback
Recommended reading
- Dunn, P.K. and Smyth, G.K. (2018). Generalized linear models with examples in R. Springer.
- Agresti, A. (2015). Foundations of linear and generalized linear models. Wiley.
- McCullagh, P. and Nelder, J.A. (1983). Generalized linear models. Chapman and Hall.
- Faraway, J.J. (2016). Extending the linear model with R: generalized linear, mixed effects and nonparametric regression models. CRC press.
Study hours
Scheduled activity hours | |
---|---|
Lectures | 22 |
Practical classes & workshops | 4 |
Tutorials | 6 |
Independent study hours | |
---|---|
Independent study | 168 |
Teaching staff
Staff member | Role |
---|---|
Ian Hall | Unit coordinator |
Timothy Waite | Unit coordinator |
Additional notes
The independent study hours will normally comprise the following. During each week of the taught part of the semester:
• You will normally have approximately 60-75 minutes of video content. Normally you would spend approximately 2-2.5 hrs per week studying this content independently.
• You will normally have exercise or problem sheets, on which you might spend approximately 1.5 hrs per week.
• There may be other tasks assigned to you on Blackboard, for example short quizzes or short-answer formative exercises.
• In some weeks you may be preparing coursework or revising for mid-semester tests.
Together with the timetabled classes, you should be spending approximately 6 hours per week on this course unit.
The remaining independent study time comprises revision for and taking the end-of-semester assessment.
The above times are indicative only and may vary depending on the week and the course unit. More information can be found on the course unit’s Blackboard page.