MSc Health Data Science
Year of entry: 2022
Course unit details:
Fundamental Mathematics & Statistics for Health Data
|Unit level||FHEQ level 7 – master's degree or fourth year of an integrated master's degree|
|Teaching period(s)||Semester 1|
|Offered by||Division of Informatics, Imaging and Data Sciences|
|Available as a free choice unit?||No|
A large amount of health and related data is currently not analysed to provide insights into healthcare delivery. A core skill required of a health data scientist is the ability to analyse various forms of health data. The unit will cover the fundamental knowledge required to do this, including understanding data, pre-processing steps, key analytical skills and a suitable statistical programming language. The unit will introduce students to what can be achieved through the analysis of health data. Key research questions will be drawn from HeRC to illustrate these techniques.
To introduce the student to a range of mathematical and statistical techniques that are widely used when analysing health data, and to demonstrate what can be achieved when using health data. The focus is on explaining which techniques are available and how best to use them, rather than going into the details of how a method works (it is more useful to know how a method could be implemented and when to use it than to understand its theoretical principles). On completion the student should be aware of a range of techniques and have experience of writing scripts and executing them in R.
- Overview of key statistical measures and theory
- Introductory mathematical toolkit - Probability theory
- Data pre-processing - Cleaning, visualisation and integrity checks
- Supervised learning - Goodness of fit, Risk/loss functions (correlation, RMSE, accuracy, sensitivity, specificity); Univariable methods and distributions; Normal, Poisson, Binomial, Multinomial, Gamma families; Chi-squared test, t-test, Wilcoxon, ANOVA, Kruskal-Wallis
- Classification and Regression - Linear models; Generalised linear models
- Data mining/unsupervised learning - Principal Component Analysis; Hierarchical clustering; Partitional clustering
- Study designs - Trial-type design; Cohort (observational) studies; Confounding and causality.
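As a rough illustration of the risk/loss measures named in the syllabus above (RMSE, accuracy, sensitivity, specificity), the sketch below computes them from first principles. It is not taken from the unit materials: the unit itself uses R, Python is used here only for illustration, and the function names and toy data are invented.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length numeric sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def classification_measures(actual, predicted):
    """Accuracy, sensitivity and specificity for binary labels (1 = condition present).

    Assumes at least one actual positive and one actual negative,
    otherwise the sensitivity or specificity denominator would be zero.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # proportion of true cases detected
        "specificity": tn / (tn + fp),  # proportion of non-cases correctly ruled out
    }


# Invented toy data: 4 patients with the condition, 6 without.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
measures = classification_measures(actual, predicted)

# Invented continuous outcome for the regression-style measure.
error = rmse([2.0, 4.0, 6.0], [1.0, 4.0, 8.0])
```

With this toy data the confusion counts are 3 true positives, 1 false negative, 4 true negatives and 2 false positives, so sensitivity is 3/4 and specificity is 4/6. Which measure matters most depends on the clinical question, for example whether missed cases or false alarms are more costly.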
Teaching and learning methods
The course will be taught in a blended-learning format: basic knowledge and directed reading will be provided via eLearning to introduce students to key knowledge. The face-to-face time will consist of a series of lectures and discussions in which core concepts (introduced through pre-reading) will be recapped and any further development discussed, as well as supervised computer time during which practical software and programming problems will be explored. When possible, lectures will be recorded and distributed online. The face-to-face (F2F) workshops will be delivered in three two-day sessions, each focussing on different analysis methods. Associated with each key concept will be a practical exercise to assess the understanding of the students. The unit assessment will require the student to write scripts demonstrating one or more of the techniques covered in the unit, and a written report describing the justification of the method, the key findings and the working. There will be online tutor support for the eLearning preparation, and two designated (virtual/F2F) tutorials with academic staff will be made available to ensure that students are continuing to progress through the unit.
Knowledge and understanding
- Identify and describe the steps required to analyse a set of health data
- Describe and explain key ideas and concepts in statistics
- Know how to prepare data sets for analysis
- Appreciate the key concepts in study design
- Understand the ideas behind building mathematical models
- Represent problems mathematically, in a way that allows statistical scripts to be written to solve them
- Identify and apply appropriate mathematical and statistical methods to analyse/model a given health data set
- Execute and write statistical scripts to compute basic statistical measures and analyse data
- Prepare a data set for analysis
Transferable skills and personal qualities
- Demonstrate statistical programming skills
- Demonstrate experience of solving applied problems
- Communicate analyses and interpretation of results of health data
- Analytical skills
- Demonstrate statistical programming skills
- Problem solving
- Demonstrate experience of solving problems
2 x Data Analysis/Programming Assignments.
Each assignment will include statistical scripts demonstrating the analysis of data and a written report (in paper style) justifying the methods and explaining the work.
Max 1500 words each. Equal weighting between assignments.
Formative assessment and feedback to students is a key feature of the online learning materials for this unit. Students will be required to engage in a wide range of interactive exercises to enhance their learning and test their developing knowledge and skills. In addition, there will be a series of supervised practical hands-on exercises that will allow for verbal feedback.
Spiegelhalter, D. (2019). The Art of Statistics: Learning from Data. Penguin UK.
Grinstead, C. M., & Snell, J. L. (2012). Introduction to probability. American Mathematical Soc. (Chapters 1-9).
Kirkwood, B. R., & Sterne, J. A. (2010). Essential medical statistics. John Wiley & Sons. (Chapters 1-21).
|Scheduled activity hours|
|Independent study hours|
|Hui Guo||Unit coordinator|