MSc Machine Learning

Year of entry: 2025

Course unit details:
Data Engineering Concepts

Course unit fact file
Unit code COMP63301
Credit rating 15
Unit level FHEQ level 7 – master's degree or fourth year of an integrated master's degree
Teaching period(s) Semester 1
Available as a free choice unit? Yes

Overview

Data Engineering plays a crucial role in enabling organisations to leverage big data for insightful analytics, driving business strategies and innovations. The field has evolved significantly from its early days of simple database management to encompass advanced technologies for large-scale data processing and analytics. This evolution reflects the growing complexity and volume of data, as well as the need for robust data infrastructures to support AI systems. As AI adoption grows across industries, the relevance of data engineering in the job market has surged. Data engineers are essential for designing, building, and maintaining the data pipelines that AI systems depend on. Consequently, there is increased demand for data engineering professionals with skills in analytical thinking, innovation, and problem solving; developing these skills is the objective of this course unit.

Pre/co-requisites

Data Management, including traditional (e.g., CSV, relational) and non-traditional (e.g., JSON, text, NoSQL) data types and associated data management technologies.

Programming in Python.

SQL and the Relational Algebra.

Data Analytics.

Aims

The unit aims to provide students with an understanding of the concepts that underpin data engineering and with experience of applying those concepts. Data engineering provides processes and mechanisms that enable value to be obtained from data. These processes and mechanisms can be viewed as forming a data engineering lifecycle, and this unit explores the concepts that underpin the different stages of that lifecycle, including data transformation and visualisation.

Learning outcomes

1. Explain the Data Engineering (DE) lifecycle, related concepts, challenges and research questions.

2. Identify relevant data properties, understanding the shape of data and its representation of the world.

3. Apply selected DE techniques for data integration, cleaning, transformation and visualisation, ensuring data quality for the purpose of data analysis.

4. Critically analyse data engineering technologies.

5. Discuss trade-offs between various design options.

Syllabus

  • Data Acquisition and Reduction
  • Understanding the shape of data
  • Data Modelling and Storage Considering Traditional and Non-Traditional Data Types
  • Data Integration
  • Data Profiling, Quality and Cleaning
  • Data Dissemination and Security
  • Data Querying
  • Data Analytics through Machine and Deep Learning
  • Data Visualisation and Serving
  • Data Mutability/Volatility, Robustness and Trust

Teaching and learning methods

Asynchronous learning material will be made available in the form of videos and directed reading, as well as formative and summative exercises delivered via the VLE.


Synchronous activities include in-person workshops, which focus on discussion of examples, clarifications and Q&A. Labs allow students to explore the coursework (what is expected and how to approach it) and to receive feedback on formative coursework.

Employability skills

Analytical skills
Innovation/creativity
Problem solving
Research

Assessment methods

Method Weight
Written exam 70%
Practical skills assessment 15%
Set exercise 15%

Feedback methods

Cohort level feedback after marking.

Individual feedback on request in lab.

Individual feedback via rubric.

Cohort feedback in workshops.

Auto-graded quizzes providing immediate feedback.

Recommended reading

Joe Reis, Matt Housley (2022): Fundamentals of Data Engineering. O'Reilly Media. ISBN 9781098108304.

Study hours

Scheduled activity hours
Assessment written exam 2
Practical classes & workshops 20
Work based learning 10
Independent study hours
Independent study 118

Teaching staff

Staff member Role
Sandra Sampaio Unit coordinator

Additional notes

Videos (~5 Hours)

Formative Quizzes (~2 Hours)

Formative Coursework (~10 Hours)

Assessed Coursework (~10 Hours)

Assessed Quizzes (~3 Hours)
