MSc Health Data Science

Year of entry: 2021

Course unit details:
Fundamental Mathematics & Statistics for Health Data

Unit code: IIDS67631
Credit rating: 15
Unit level: FHEQ level 7 – master's degree or fourth year of an integrated master's degree
Teaching period(s): Semester 1
Offered by: Division of Informatics, Imaging and Data Sciences
Available as a free choice unit?: No

Overview

A large amount of health and related data is currently not analysed to provide insights into healthcare delivery.  A core skill required of a health data scientist is the ability to analyse various forms of health data. The unit will cover the fundamental knowledge required to do this, including understanding data, pre-processing steps, key analytical skills and a suitable statistical programming language.  The unit will introduce students to what can be achieved through the analysis of health data.  Key research questions will be drawn from HeRC to illustrate these techniques.

Aims

To introduce the student to a range of mathematical and statistical techniques that are widely used when analysing health data, and to demonstrate what can be achieved when using health data. The focus is on explaining which techniques are available and how best to use them, rather than going into the details of how a method works (it is more useful to know how a method could be implemented and when to use it than to understand its theoretical principles). On completion the student should be aware of a range of techniques and have experience of writing scripts and executing them in R.

Learning outcomes

Indicative content:

  • Overview of key statistical measures and theory
  • Introductory mathematical toolkit - Probability theory
  • Data pre-processing - Cleaning, visualisation and integrity checks
  • Supervised learning - Goodness of fit, Risk/loss functions (correlation, RMSE, accuracy, sensitivity, specificity); Univariable methods and distributions; Normal, Poisson, Binomial, Multinomial, Gamma-families; Chi-squared test, t-test, Wilcoxon, ANOVA, Kruskal-Wallis
  • Classification and Regression - Linear models; Generalised linear models
  • Data mining/unsupervised learning - Principal Component Analysis; Hierarchical clustering; Partitional clustering
  • Study designs - Trial-type design; Cohort (observational) studies; Confounding and causality.
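
As an illustration of the risk/loss functions listed under supervised learning, the sketch below computes RMSE, accuracy, sensitivity and specificity from first principles. It is written in Python purely for illustration (the unit itself uses R). The function names and the toy data are hypothetical and are not taken from the unit materials.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length numeric sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def classification_measures(actual, predicted):
    """Accuracy, sensitivity and specificity for binary labels.

    Labels are assumed to be coded 1 (positive) and 0 (negative), and the
    data are assumed to contain at least one positive and one negative case.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # proportion of positives detected
        "specificity": tn / (tn + fp),  # proportion of negatives detected
    }


# Hypothetical toy data: 1 = condition present, 0 = condition absent.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
measures = classification_measures(actual, predicted)

# Hypothetical continuous outcome and model predictions.
toy_rmse = rmse([2.0, 4.0, 6.0], [1.0, 4.0, 8.0])

print(measures)
print(round(toy_rmse, 4))
```

Accuracy alone can be misleading when one class is rare, which is why sensitivity and specificity are reported alongside it.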

Teaching and learning methods

The course will be taught in a blended-learning format: basic knowledge and directed reading will be provided via eLearning to introduce students to key knowledge.  The face-to-face (F2F) time will consist of a series of lectures and discussions in which core concepts (introduced through pre-reading) will be recapped and any further development discussed, as well as supervised computer time during which practical software and programming problems will be explored.  When possible, lectures will be recorded and distributed online.  The F2F workshops will be delivered in 3 x two-day sessions, each focussing on different analysis methods.  Associated with each key concept will be a practical exercise to assess the students' understanding.  The unit assessment will require the student to write scripts demonstrating one or more of the techniques covered in the unit, together with a written report describing the justification of the method, the key findings and the working.  There will be online tutor support for the eLearning preparation, and two designated (virtual/F2F) tutorials with academic staff will be made available to ensure that students continue to progress through the unit.

Knowledge and understanding

  • Identify and describe the steps required to analyse a set of health data
  • Describe and explain key ideas and concepts in statistics
  • Know how to prepare data sets for analysis
  • Appreciate the key concepts in study design
  • Understand the ideas behind building mathematical models

Intellectual skills

  • Represent problems mathematically, in a way that allows statistical scripts to be written to solve them
  • Identify and apply appropriate mathematical and statistical methods to analyse/model a given health data set

Practical skills

  • Execute and write statistical scripts to compute basic statistical measures and analyse data
  • Prepare a data set for analysis

Transferable skills and personal qualities

  • Demonstrate statistical programming skills
  • Demonstrate experience of solving applied problems
  • Communicate analyses and interpretation of results of health data

Employability skills

Analytical skills: Demonstrate statistical programming skills
Problem solving: Demonstrate experience of solving problems

Assessment methods

3 x Data Analysis/Programming Assignments.

Each assignment will include statistical scripts demonstrating the analysis of data and a written report (in paper style) justifying the methods and explaining the work.

Maximum 1,500 words each.  Equal weighting between assignments.

Feedback methods

Formative assessment and feedback to students are key features of the online learning materials for this unit.  Students will be required to engage in a wide range of interactive exercises to enhance their learning and test their developing knowledge and skills.  In addition, there will be a series of supervised practical hands-on exercises that will allow for verbal feedback.

Study hours

Scheduled activity hours
Lectures: 42
Independent study hours
Independent study: 108

Teaching staff

Staff member: Hui Guo
Role: Unit coordinator
