Course unit details:
Programming for Health Data Science
Unit code | IIDS69061 |
---|---|
Credit rating | 15 |
Unit level | FHEQ level 7 – master's degree or fourth year of an integrated master's degree |
Teaching period(s) | Semester 1 |
Available as a free choice unit? | No |
Overview
- Fundamental data types and structures
- Core programming concepts such as iteration, selection, file handling and control flow
- Programming languages used in data science (e.g. Python, R, Julia)
- The fundamentals of using a modern programming language (e.g. Python) for data science, together with its commonly used libraries and modules
- How data is stored and accessed using database systems
- Data manipulation and pre-processing (cleaning, imputation, encoding and transforming data)
- Combining datasets (data linkage)
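As an illustration of the pre-processing and data linkage topics listed above, the following is a minimal sketch using only the Python standard library. The records, field names (`patient_id`, `sex`, `age`, `ward`) and the helper functions `preprocess` and `link` are hypothetical and are not taken from the unit materials; the unit itself may well use data-frame libraries for the same steps.

```python
from statistics import median

# Hypothetical patient records; None marks a missing value.
patients = [
    {"patient_id": 1, "sex": " F ", "age": 34},
    {"patient_id": 2, "sex": "m", "age": None},
    {"patient_id": 3, "sex": "F", "age": 58},
]

# Hypothetical second dataset that shares the same identifier.
admissions = [
    {"patient_id": 1, "ward": "Cardiology"},
    {"patient_id": 3, "ward": "Renal"},
]


def preprocess(records):
    """Clean, impute and encode a list of patient dictionaries."""
    known_ages = [r["age"] for r in records if r["age"] is not None]
    fill_age = median(known_ages)  # imputation: median of the observed ages
    cleaned = []
    for r in records:
        sex = r["sex"].strip().upper()  # cleaning: trim whitespace, normalise case
        cleaned.append({
            "patient_id": r["patient_id"],
            "sex": sex,
            "is_female": 1 if sex == "F" else 0,  # encoding: binary indicator
            "age": r["age"] if r["age"] is not None else fill_age,
        })
    return cleaned


def link(left, right, key):
    """Left-join two lists of dictionaries on a shared key (exact-match linkage)."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


linked = link(preprocess(patients), admissions, "patient_id")
for row in linked:
    print(row)
```

Patients without a matching admission are kept but carry no `ward` field, which mirrors a left join.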
Aims
The unit aims to:
- Facilitate the practice of writing scripts in a modern programming language suited to data science tasks, applying software engineering best practices and standards
- Give learners experience of manipulating data presented and stored in different formats (e.g. JSON, XML, CSV)
- Build familiarity with using, accessing and querying data in different database storage systems (e.g. relational database systems)
- Understand and practise data wrangling, transformation and cleaning strategies to make data usable for analytic purposes
- Produce data visualisations to explore and present data
- Develop 'algorithmic thinking' and problem-solving strategies
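To make the aim of handling data stored in different formats concrete, here is a small sketch that reads the same record from JSON, CSV and XML using only the Python standard library. The sample data, the field names (`patient_id`, `heart_rate`) and the function names are assumptions made for illustration, not part of the unit materials.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same hypothetical record expressed in three formats.
JSON_TEXT = '[{"patient_id": "1", "heart_rate": "72"}]'
CSV_TEXT = "patient_id,heart_rate\n1,72\n"
XML_TEXT = (
    "<patients><patient>"
    "<patient_id>1</patient_id><heart_rate>72</heart_rate>"
    "</patient></patients>"
)


def from_json(text):
    """Parse a JSON array of objects into a list of dictionaries."""
    return json.loads(text)


def from_csv(text):
    """Parse CSV text with a header row into a list of dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def from_xml(text):
    """Parse <patient> elements into dictionaries keyed by child tag name."""
    root = ET.fromstring(text)
    return [
        {child.tag: child.text for child in patient}
        for patient in root.findall("patient")
    ]


records = [from_json(JSON_TEXT), from_csv(CSV_TEXT), from_xml(XML_TEXT)]
# All three parsers should yield the same list of dictionaries.
print(all(r == records[0] for r in records))
```

Values are kept as strings here because CSV and XML carry no type information; converting them to numbers would be a separate cleaning step.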
Teaching and learning methods
The unit is delivered in a blended format. Self-directed learning material is delivered through interactive Jupyter notebooks, which impart core knowledge and skills. Weekly online synchronous sessions allow learners to work in teams to solve and practise coding problems. Two face-to-face hackathon events provide further support: in the first, learners load, combine and visualise data; in the second, they build on this by applying methods to clean and process data for subsequent analysis.
Knowledge and understanding
Upon completion, students should be able to:
LO1: Demonstrate a critical understanding of the fundamental principles and concepts of programming (e.g. iteration, selection, control flow and data representation using data-frames) using a modern programming language for data science (e.g. Python)
LO2: Identify and explain key modules essential for health data science (e.g. NumPy, Pandas, re, and Matplotlib), with a focus on data-frame manipulation
LO3: Describe basic statistical concepts and their application in health data analysis, and how they are represented/used within a programming language
Intellectual skills
Upon completion, students should be able to:
LO4: Apply 'algorithmic thinking' to solve problems using programming concepts (e.g. selection, iteration, functions, etc.)
LO5: Analyse and manipulate health-related datasets using data-frames, including indexing, filtering, aggregation, and integration with SQL databases
LO6: Interpret and visualise data effectively for insights and decision-making in health-related contexts, using common visualisation libraries
Practical skills
Upon completion, students should be able to:
LO7: Write code for basic statistical analysis and data visualisation tasks, incorporating data-frame operations
LO8: Clean and pre-process health-related datasets using regular expressions and data-frame methods
LO9: Write and execute queries in SQL and integrate with code
LO10: Implement best practices in coding to structure and document code (e.g. use of data structures, functions, classes, comments, code standards such as PEP8)
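The practical skills in LO8 and LO9 (cleaning with regular expressions, and writing SQL queries that are integrated with code) can be sketched together as follows. This is an illustrative example only: the free-text readings, the `blood_pressure` table, the `parse_bp` helper and the threshold of 130 are all hypothetical, and an in-memory SQLite database stands in for whatever database system the unit actually uses.

```python
import re
import sqlite3

# Hypothetical free-text readings with inconsistent formatting.
raw_readings = [
    ("P001", "BP: 120/80 mmHg"),
    ("P002", "bp 135 / 90"),
    ("P003", "not recorded"),
]

# Two numbers of 2-3 digits separated by a slash, with optional spaces.
BP_PATTERN = re.compile(r"(\d{2,3})\s*/\s*(\d{2,3})")


def parse_bp(text):
    """Return (systolic, diastolic) as integers, or None if no reading is found."""
    match = BP_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blood_pressure "
    "(patient_id TEXT, systolic INTEGER, diastolic INTEGER)"
)

# Load only the rows that could be parsed; parameters avoid SQL injection.
for patient_id, text in raw_readings:
    parsed = parse_bp(text)
    if parsed is not None:
        conn.execute(
            "INSERT INTO blood_pressure VALUES (?, ?, ?)",
            (patient_id, *parsed),
        )

# Illustrative threshold only; it is not clinical guidance.
high = conn.execute(
    "SELECT patient_id, systolic FROM blood_pressure "
    "WHERE systolic >= ? ORDER BY patient_id",
    (130,),
).fetchall()
print(high)
conn.close()
```

Passing values through `?` placeholders rather than string formatting keeps the query and the data separate, which is the usual way to integrate SQL with Python code safely.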
Transferable skills and personal qualities
Upon completion, students should be able to:
LO11: Experience 'team science' to solve problems collaboratively
LO12: Develop an analytical problem-solving mindset
Assessment methods
Assessment task | Length | Weighting within unit |
---|---|---|
7-minute individual viva based on data report. Students are expected to submit a short report that details how they follow the stages of loading, processing and reporting on a health dataset. | N/A | 100% |
Feedback methods
Feedback will be provided via Blackboard 15 working days after submission.
Recommended reading
- Dawson, M (2010) Python Programming 3rd Ed. Australia: Course Technology PTR
- Molin, S (2019) Hands-On Data Analysis with Pandas. Birmingham: Packt
- Medium (2024) Towards data science: A Medium publication sharing concepts, ideas and codes. https://towardsdatascience.com/about
- NHS (2024) NHS Python Community. https://nhs-pycom.net/
Study hours
Scheduled activity hours | |
---|---|
Lectures | 8 |
Practical classes & workshops | 16 |
Independent study hours | |
---|---|
Independent study | 126 |
Teaching staff
Staff member | Role |
---|---|
Ali Sarrami Foroushani | Unit coordinator |