MSc Health Data Science

Year of entry: 2025

Course unit details:
Programming for Health Data Science

Course unit fact file
Unit code: IIDS69061
Credit rating: 15
Unit level: FHEQ level 7 – master's degree or fourth year of an integrated master's degree
Teaching period(s): Semester 1
Available as a free choice unit?: No

Overview

  • Fundamental data types and structures
  • Core programming concepts such as iteration, selection, file handling and control flow
  • Awareness of programming languages used in data science (e.g. Python, R, Julia)
  • The fundamentals of using a modern programming language (e.g. Python) for data science and associated commonly used libraries/modules
  • How data is stored and accessed using database systems
  • Data manipulation and pre-processing (cleaning, imputation, encoding and transforming data)
  • Combining datasets (data linkage)
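
As a rough illustration of the pre-processing and linkage topics listed above, the sketch below cleans, imputes, encodes and links two tiny tables. It is a minimal example under stated assumptions, not material from the unit: the data, the column names and the helper functions (load_rows, clean_patients, link) are hypothetical, and it uses only the Python standard library, whereas the unit itself lists data-frame libraries such as Pandas for this kind of work.

```python
import csv
import io
from statistics import mean

# Hypothetical records: two small tables sharing a patient_id key.
PATIENTS_CSV = """patient_id,sex,age
1,F,34
2,M,
3,f,58
"""

VISITS_CSV = """patient_id,systolic_bp
1,120
3,141
4,133
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_patients(rows):
    """Standardise sex codes, impute missing ages with the mean, encode sex."""
    known_ages = [int(r["age"]) for r in rows if r["age"].strip()]
    mean_age = mean(known_ages)
    cleaned = []
    for r in rows:
        sex = r["sex"].strip().upper()  # cleaning: consistent upper-case codes
        # imputation: fill a missing age with the mean of the known ages
        age = int(r["age"]) if r["age"].strip() else mean_age
        cleaned.append({
            "patient_id": int(r["patient_id"]),
            "sex": sex,
            "sex_is_female": 1 if sex == "F" else 0,  # encoding: binary indicator
            "age": age,
        })
    return cleaned


def link(patients, visits):
    """Inner join of the two tables on patient_id (exact-match linkage)."""
    bp_by_id = {int(v["patient_id"]): int(v["systolic_bp"]) for v in visits}
    return [
        {**p, "systolic_bp": bp_by_id[p["patient_id"]]}
        for p in patients
        if p["patient_id"] in bp_by_id
    ]


patients = clean_patients(load_rows(PATIENTS_CSV))
linked = link(patients, load_rows(VISITS_CSV))
```

Mean imputation and exact-key matching are chosen here only because they are the simplest options to show; real health datasets often call for more careful strategies.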

Aims

The unit aims to:

  • Facilitate the practice of writing scripts in a modern programming language suitable for data science tasks, while applying software engineering best practices and standards
  • Give learners experience of manipulating data presented and stored in different formats (e.g. JSON, XML, CSV)
  • Build familiarity with using, accessing and querying data in different database storage systems (e.g. relational database systems)
  • Develop understanding and practice of data wrangling, transformation and cleaning strategies that make data usable for analysis
  • Enable learners to produce data visualisations to explore and present data
  • Develop 'algorithmic thinking' and problem-solving strategies

Teaching and learning methods

The unit is delivered in a blended format. Self-directed learning material is delivered through interactive digital (Jupyter) notebooks to impart core knowledge and skills, and weekly online synchronous sessions allow learners to work in teams to practise and solve coding problems. This is further supported by two face-to-face hackathon events. In the first hackathon, learners load, combine and visualise data. Building on this, in the second hackathon learners apply methods to clean and process data in order to carry out subsequent analysis.

Knowledge and understanding

Upon completion, students should be able to: 

LO1: Demonstrate a critical understanding of the fundamental principles and concepts of programming (e.g. iteration, selection, control flow and data representation using data-frames) using a modern programming language for data science (e.g. Python) 

LO2: Identify and explain key modules essential for health data science (e.g. NumPy, Pandas, re, and Matplotlib), with a focus on data-frame manipulation 

LO3: Describe basic statistical concepts and their application in health data analysis, and how they are represented/used within a programming language

Intellectual skills

Upon completion, students should be able to: 

LO4: Apply 'algorithmic thinking' to solve problems using programming concepts (e.g. selection, iteration, functions, etc.) 

LO5: Analyse and manipulate health-related datasets using data-frames, including indexing, filtering, aggregation, and integration with SQL databases 

LO6: Interpret and visualise data effectively for insights and decision-making in health-related contexts, using common visualisation libraries

Practical skills

Upon completion, students should be able to: 

LO7: Write code for basic statistical analysis and data visualisation tasks, incorporating data-frame operations 

LO8: Clean and pre-process health-related datasets using regular expressions and data-frame methods 

LO9: Write and execute SQL queries and integrate them with program code

LO10: Implement best practices in coding to structure and document code (e.g. use of data structures, functions, classes, comments, code standards such as PEP8)
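
To make LO9 and LO10 concrete, the following is a minimal sketch of running SQL queries from Python and wrapping them in small, documented functions. Everything in it is an assumption made for illustration: the admissions table, its columns, the sample rows and the function names are hypothetical, and SQLite (via the standard-library sqlite3 module) stands in for whichever database system the unit actually uses.

```python
import sqlite3


def mean_stay_by_ward(conn):
    """Return {ward: mean length of stay in days} using a SQL aggregate."""
    query = """
        SELECT ward, AVG(length_of_stay)
        FROM admissions
        GROUP BY ward
        ORDER BY ward
    """
    return {ward: avg for ward, avg in conn.execute(query)}


def admissions_longer_than(conn, min_days):
    """Return patient ids for stays longer than min_days, ordered by id."""
    # The ? placeholder passes min_days as a parameter rather than
    # building the SQL text by string concatenation.
    rows = conn.execute(
        "SELECT patient_id FROM admissions "
        "WHERE length_of_stay > ? ORDER BY patient_id",
        (min_days,),
    )
    return [patient_id for (patient_id,) in rows]


# Hypothetical in-memory database with three admissions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE admissions "
    "(patient_id INTEGER, ward TEXT, length_of_stay INTEGER)"
)
conn.executemany(
    "INSERT INTO admissions VALUES (?, ?, ?)",
    [(1, "A", 2), (2, "A", 4), (3, "B", 7)],
)

stay_by_ward = mean_stay_by_ward(conn)
long_stays = admissions_longer_than(conn, 3)
conn.close()
```

Keeping each query inside a named function with a docstring is one way to follow the structuring and documentation practices that LO10 mentions.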

Transferable skills and personal qualities

Upon completion, students should be able to: 

LO11: Work collaboratively, in the spirit of 'team science', to solve problems

LO12: Develop an analytical problem-solving mindset

Assessment methods

Assessment task: 7-minute individual viva based on a data report. Students are expected to submit a short report that details how they followed the stages of loading, processing and reporting on a health dataset.
Length: N/A
Weighting within unit: 100%

Feedback methods

Feedback will be provided via Blackboard 15 working days after submission.

Recommended reading

  • Dawson, M (2010) Python Programming 3rd Ed. Australia: Course Technology PTR
  • Molin, S (2019) Hands-On Data Analysis with Pandas. Birmingham: Packt
  • Medium (2024) Towards data science: A Medium publication sharing concepts, ideas and codes. https://towardsdatascience.com/about 
  • NHS (2024) NHS Python Community. https://nhs-pycom.net/ 

Study hours

Scheduled activity hours
Lectures: 8
Practical classes & workshops: 16
Independent study hours
Independent study: 126

Teaching staff

Staff member: Ali Sarrami Foroushani
Role: Unit coordinator
