Clinical Data Engineering CPD / Course details

Year of entry: 2022

Course description

Data scientists ask new questions of data by applying analysis methods. The data can be collected and combined from different sources and in different formats.

A large part of data science is the data pre-processing (data wrangling) that first takes place to clean and transform data into a format that can be easily analysed. This can involve creating pipelines and infrastructure to store and process data.
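As a rough illustration of what data wrangling can involve, the Python sketch below tidies a small, made-up CSV export. The column names (`patient_id`, `sex`, `weight_kg`), the sample records and the helper functions are assumptions invented for this example; they are not taken from the course materials.

```python
import csv
import io

# Hypothetical raw export: stray whitespace, mixed casing and a missing value.
RAW_CSV = """patient_id,sex,weight_kg
 P001 ,F,61.5
P002,m,
p003,M,82
"""


def clean_row(row):
    """Normalise one record: trim text, standardise case, parse numbers."""
    weight = row["weight_kg"].strip()
    return {
        "patient_id": row["patient_id"].strip().upper(),
        "sex": row["sex"].strip().upper(),
        # Keep missing weights as None rather than guessing a value.
        "weight_kg": float(weight) if weight else None,
    }


def clean(raw_text):
    """Read CSV text and return a list of cleaned records."""
    reader = csv.DictReader(io.StringIO(raw_text))
    return [clean_row(row) for row in reader]


records = clean(RAW_CSV)
```

Each step here (trimming, standardising, parsing) is a small transformation; chaining such steps is one simple way to think about a data pipeline.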

Our introductory Clinical Data Engineering CPD unit aims to introduce concepts of data storage (databases), data access, cleaning/preparing data, and data security and governance.

This unit is suitable for clinicians, allied health professionals and others who want to learn more about how they can make use of their data.

This unit will cover:

  • fundamental data types and structures
  • structured and unstructured data
  • the fundamentals of coding for data science
  • how data is modelled in different database systems
  • querying, filtering and cleaning data
  • representing data using data frames
  • data transformation
  • data governance and security.

Course dates

  • 3 October to 2 December 2022 (part-time)

Time commitment:

  • An online introductory webinar is scheduled for Week 1 on Monday 3 October 2022, 3pm-4pm. Students can either join at the time of the session or watch the recording later, in their own time.
  • An optional hackathon event is scheduled for Week 7 on Wednesday 16 November 2022 (time TBC). This event will allow participants to practise data engineering by working with clinical data sets for analysis and creating a data pipeline. The one-day event will be hybrid, so students can join either face-to-face in Manchester or online.

Other than these planned events, the course is mainly delivered online, with self-directed learning materials that can be accessed at any time.

We recommend that participants spend between 10 and 15 hours a week on independent learning and relevant weekly tasks.


On successful completion of the course, you will receive 15 postgraduate credits that (subject to programme approval) can count towards a PGCert in Clinical Data Science. The PGCert course is expected to start in September 2023, pending approval.


The unit aims to give you hands-on experience of:

  • applying tools and techniques used to access data in different common formats;
  • transforming and combining this data into a format suitable for subsequent data analysis (such as application of statistical methods/machine learning algorithms) by creating data processing pipelines;
  • using, accessing and querying data in different database storage systems (such as relational and NoSQL databases).
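To give a flavour of these activities, here is a minimal sketch using only Python's standard library. It reads hypothetical data in two common formats (CSV and JSON), loads both into an in-memory SQLite relational database and runs a query that combines them. The table names, fields and values are invented for illustration and do not come from the course; the tools used in the unit itself may differ.

```python
import csv
import io
import json
import sqlite3

# Hypothetical inputs: related data arriving in two different formats.
PATIENTS_CSV = "patient_id,year_of_birth\nP001,1980\nP002,1975\n"
READINGS_JSON = (
    '[{"patient_id": "P001", "systolic": 128},'
    ' {"patient_id": "P001", "systolic": 135},'
    ' {"patient_id": "P002", "systolic": 142}]'
)


def build_database(patients_csv, readings_json):
    """Load both sources into an in-memory relational database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patients (patient_id TEXT PRIMARY KEY, year_of_birth INTEGER)"
    )
    conn.execute("CREATE TABLE readings (patient_id TEXT, systolic INTEGER)")
    patients = [
        (row["patient_id"], int(row["year_of_birth"]))
        for row in csv.DictReader(io.StringIO(patients_csv))
    ]
    readings = [
        (item["patient_id"], item["systolic"]) for item in json.loads(readings_json)
    ]
    # Parameterised inserts keep data values separate from the SQL text.
    conn.executemany("INSERT INTO patients VALUES (?, ?)", patients)
    conn.executemany("INSERT INTO readings VALUES (?, ?)", readings)
    conn.commit()
    return conn


def mean_systolic_by_patient(conn):
    """Combine the two tables and summarise readings per patient."""
    query = """
        SELECT p.patient_id, p.year_of_birth, AVG(r.systolic)
        FROM patients AS p
        JOIN readings AS r ON r.patient_id = p.patient_id
        GROUP BY p.patient_id, p.year_of_birth
        ORDER BY p.patient_id
    """
    return conn.execute(query).fetchall()


conn = build_database(PATIENTS_CSV, READINGS_JSON)
summary = mean_systolic_by_patient(conn)
```

The resulting `summary` is a list of one row per patient, which is the kind of tidy, combined table that statistical or machine learning methods typically expect as input.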

The unit will also:

  • introduce the importance of data security issues from both a technical and a legislative perspective;
  • explore the benefits and challenges of accessing and working with health/clinical data.

Special features

Co-designed course content

The course has been co-designed with end users and other stakeholders (including patients) to ensure that it is of real value to working professionals in health and social care.

We have partnered with leading organisations in health education and care, including the National School of Healthcare Science, Health Education England and The Christie NHS Foundation Trust.

Hackathon event

The course features a hackathon on Wednesday 16 November 2022, where you can practise data engineering by working with clinical data sets for analysis and creating a data pipeline.

Employability focus

Unit assessments have been designed around providing workplace value.

Data platform-based learning

The course makes use of a learning environment that is also a data platform, giving you access to data and to tools for working with and learning from it, including interactive digital notebooks.

Learn from experts

You will learn from experts who have clinical as well as industry experience working in healthcare, data science and data engineering.

Teaching and learning

The course is mainly delivered online, with self-directed learning materials that can be accessed at any time.

This is also supported by synchronous webinars, forums and digital communication platforms that help students to build an active learning community and benefit from networking. This unit will have one synchronous webinar in week 1, taking place on Monday 3 October 2022, 3pm-4pm via Zoom.

There is also an optional face-to-face day allowing participants to visit and make use of the university campus and equipment, as well as to meet and get to know their fellow students in person.

Sessions are recorded so that students who cannot make synchronous or face-to-face sessions are still able to view any sessions they miss.

Coursework and assessment

This unit is assessed entirely through coursework. You will construct a Data Management Plan (DMP) detailing how a chosen dataset will be stored and processed, along with details of the characteristics of the chosen dataset.


The University of Manchester offers extensive library and online services to help you get the most out of your studies.

Disability support

Practical support and advice for current students and applicants is available from the Disability Advisory and Support Service. Email: