BSc Computer Science and Mathematics with Industrial Experience

Year of entry: 2024

Course unit details:
Generalised Linear Models

Course unit fact file
Unit code MATH38171
Credit rating 20
Unit level Level 3
Teaching period(s) Semester 1
Available as a free choice unit? No

Overview

Linear models are an important modelling strategy for investigating whether, and how, one or more explanatory variables (such as age, sex or blood pressure) influence a response variable (such as a patient's diagnosis), while taking random variation in the data into account. In linear models, linear regression and the Normal distribution are used to explore a possible linear relationship between a continuous response and one or more explanatory variables. This course unit relaxes linearity and normality, the two strict assumptions of linear models. We study the extension from linearity to non-linearity, and from normality to a commonly encountered family of distributions called the exponential family. This extension forms generalized linear models (GLMs). On the one hand, GLMs unify linear and non-linear models in terms of statistical modelling. On the other hand, they can be used to analyze discrete data, including binary, binomial, count and categorical data, which arise very often in biomedical and industrial applications.
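The contrast between the two model classes can be written compactly. The notation below (response Y_i, covariate vector x_i, coefficient vector \beta, link function g) is a conventional choice assumed here for illustration and is not taken from the unit's own materials.

```latex
\begin{align*}
\text{Linear model:} \quad & Y_i \sim N(\mu_i, \sigma^2), & \mu_i &= x_i^{\top}\beta, \\
\text{GLM:} \quad & Y_i \sim \text{exponential family with mean } \mu_i, & g(\mu_i) &= x_i^{\top}\beta.
\end{align*}
% Common link functions:
%   binary data: logit link, g(\mu) = \log\{\mu / (1 - \mu)\}
%   count data:  log link,   g(\mu) = \log \mu
```

Taking g to be the identity and the response distribution to be Normal recovers the linear model as a special case.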

Pre/co-requisites

Unit title Unit code Requirement type Description
Linear Algebra MATH11022 Pre-Requisite Compulsory
Probability and Statistics 2 MATH27720 Pre-Requisite Compulsory
Linear Regression Models MATH27711 Pre-Requisite Compulsory
MATH38171 Pre-Reqs

Pre-requisite: either MATH10202 or MATH10212 (Linear Algebra A or B).

 

Aims

  • To introduce the theory and application of generalized linear models, including parameter estimation via numerical maximization of the likelihood function, confidence intervals, hypothesis testing, model selection, model diagnostics, and use of R.

 

Learning outcomes

After successful completion of the course, students will be able to:

  1. formulate appropriate generalized linear models and statistical hypotheses to investigate real-world questions involving non-normal response variables;
  2. apply generalized linear model techniques to make inferences and predictions about the relationship between covariates and a non-normal response variable, and to shed light on real-world questions;
  3. check whether the assumptions underpinning such analyses are justified;
  4. explain and interpret the resulting models and conclusions, with reference to the original real-world questions where appropriate, and also explain key underpinning ideas, assumptions, procedures, and theoretical results;
  5. prove underpinning mathematical and theoretical results;
  6. use R to apply the methods and to conduct appropriate simulation studies to evaluate the performance of the methods studied.

 

 

Syllabus

Introduction [6]

Motivation for GLMs: binary and count data, brief introduction to logistic and Poisson regression.  

Review of linear models, likelihood theory and R programming. Categorical factors and interactions.

 

Generalized linear models and their estimation [5]

Exponential dispersion family. General form of a GLM including response distribution, link function and linear predictor. Iterative numerical methods for maximum likelihood estimation of the regression coefficients. Comparison of methods for estimation of the dispersion parameter.
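As a rough illustration of the iterative numerical methods mentioned above, the sketch below fits a Poisson regression with log link by iteratively reweighted least squares (IRLS). The unit itself uses R, where the built-in glm function fits such models; the use of Python here, the helper names solve and fit_poisson_irls, the starting values and the toy data are all assumptions made for illustration, not course material.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_poisson_irls(X, y, tol=1e-10, max_iter=50):
    """Poisson regression with log link, fitted by IRLS (hypothetical helper)."""
    p = len(X[0])
    mu = [yi + 0.1 for yi in y]          # assumed starting values, kept away from zero
    eta = [math.log(m) for m in mu]      # linear predictor on the log scale
    beta = [0.0] * p
    for _ in range(max_iter):
        # For the log link the working weights are mu and the
        # working response is z = eta + (y - mu) / mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Weighted least squares step: solve (X' W X) beta = X' W z.
        XtWX = [[sum(m * row[j] * row[k] for row, m in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        XtWz = [sum(m * row[j] * zi for row, m, zi in zip(X, mu, z))
                for j in range(p)]
        new_beta = solve(XtWX, XtWz)
        step = max(abs(a - b) for a, b in zip(new_beta, beta))
        beta = new_beta
        eta = [sum(xj * bj for xj, bj in zip(row, beta)) for row in X]
        mu = [math.exp(e) for e in eta]
        if step < tol:
            break
    return beta


# Toy data (made up): an intercept column plus one covariate, and counts y.
X = [[1.0, float(x)] for x in range(5)]
y = [1, 2, 4, 7, 12]
beta_hat = fit_poisson_irls(X, y)
print("estimated coefficients:", beta_hat)
```

Because the log link is the canonical link for the Poisson distribution, the fitted means satisfy the score equations X'(y - mu) = 0 at convergence, which gives a simple check on the result.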

 

Inference, model selection and diagnostics [6]

Wald tests and confidence intervals. Likelihood ratio tests, score tests, and analysis of deviance. Goodness-of-fit tests. AIC and BIC. Stepwise selection. Pearson, deviance and quantile residuals; diagnostic plots.

 

Applications and extensions [5]

Binary and binomial data: particular issues, separation, goodness-of-fit. Contingency tables, log-linear models. Gamma GLMs. Insurance claims. Credit scoring. Time permitting, a brief introduction to a more advanced extension.

Assessment methods

Method Weight
Coursework 20%
Written exam (end of semester) 80%

Feedback methods

  • Tutorials, labs and office hours provide opportunities to receive individual feedback.

Recommended reading

  • Dunn, P.K. and Smyth, G.K. (2018). Generalized linear models with examples in R. Springer.
  • Agresti, A. (2015). Foundations of linear and generalized linear models. Wiley.
  • McCullagh, P. and Nelder, J.A. (1983). Generalized linear models. Chapman and Hall.
  • Faraway, J.J. (2016). Extending the linear model with R: generalized linear, mixed effects and nonparametric regression models. CRC Press.

Study hours

Scheduled activity hours
Lectures 22
Practical classes & workshops 4
Tutorials 6
Independent study hours
Independent study 168

Teaching staff

Staff member Role
Ian Hall Unit coordinator
Timothy Waite Unit coordinator

Additional notes

The independent study hours will normally comprise the following. During each week of the taught part of the semester:

• You will normally have approximately 60-75 minutes of video content. You would normally spend approximately 2-2.5 hours per week studying this content independently.
• You will normally have exercise or problem sheets, on which you might spend approximately 1.5 hours per week.
• There may be other tasks assigned to you on Blackboard, for example short quizzes or short-answer formative exercises.
• In some weeks you may be preparing coursework or revising for mid-semester tests.

Together with the timetabled classes, you should be spending approximately 6 hours per week on this course unit.

The remaining independent study time comprises revision for and taking the end-of-semester assessment.

The above times are indicative only and may vary depending on the week and the course unit. More information can be found on the course unit’s Blackboard page.
