Course unit details:
Data Engineering Concepts
Unit code | COMP63301 |
---|---|
Credit rating | 15 |
Unit level | FHEQ level 7 – master's degree or fourth year of an integrated master's degree |
Teaching period(s) | Semester 1 |
Available as a free choice unit? | Yes |
Overview
Data engineering plays a crucial role in enabling organisations to leverage big data for insightful analytics, driving business strategies and innovations. The field has evolved significantly from its early days of simple database management to encompass advanced technologies for large-scale data processing and analytics. This evolution reflects the growing complexity and volume of data, as well as the need for robust data infrastructures to support AI systems. As AI is adopted across more industries, the relevance of data engineering in the job market has surged. Data engineers are essential for designing, building, and maintaining the data pipelines that AI systems depend on. Consequently, demand has grown for data engineering professionals with skills in analytical thinking, innovation, and problem solving; developing these skills is the objective of this course unit.
Pre/co-requisites
Data Management, including traditional (e.g., CSV, relational) and non-traditional (e.g., JSON, text, NoSQL) data types and associated data management technologies.
Programming in Python.
SQL and Relational Algebra.
Data Analytics.
Aims
The unit aims to provide students with an understanding of the concepts that underpin data engineering and with experience of applying those concepts. Data engineering provides the processes and mechanisms that enable value to be obtained from data. These processes and mechanisms can be seen as forming a data engineering lifecycle, and this unit explores the concepts that underpin the different stages of that lifecycle, including data transformation and visualisation.
Learning outcomes
1. Explain the Data Engineering (DE) lifecycle, related concepts, challenges and research questions.
2. Identify relevant data properties, understanding the shape of data and its representation of the world.
3. Apply selected DE techniques for data integration, cleaning, transformation and visualisation, ensuring data quality for the purpose of data analysis.
4. Critically analyse data engineering technologies.
5. Discuss trade-offs between various design options.
Syllabus
- Data Acquisition and Reduction
- Understanding the shape of data
- Data Modelling and Storage Considering Traditional and Non-Traditional Data Types
- Data Integration
- Data Profiling, Quality and Cleaning
- Data Dissemination and Security
- Data Querying
- Data Analytics through Machine and Deep Learning
- Data Visualisation and Serving
- Data Mutability/Volatility, Robustness and Trust
Teaching and learning methods
Asynchronous learning material will be made available in the form of videos and directed reading, as well as formative and normative exercises delivered via the VLE.
Synchronous activities include in-person workshops, which focus on discussion of examples, clarifications and Q&A. Labs allow students to explore the coursework, including what is expected and how to approach it, and to receive feedback on formative coursework.
Employability skills
- Analytical skills
- Innovation/creativity
- Problem solving
- Research
Assessment methods
Method | Weight |
---|---|
Written exam | 70% |
Practical skills assessment | 15% |
Set exercise | 15% |
Feedback methods
Cohort-level feedback after marking.
Individual feedback on request in lab.
Individual feedback via rubric.
Cohort feedback in workshops.
Auto-graded quizzes providing immediate feedback.
Recommended reading
Joe Reis, Matt Housley (2022): Fundamentals of Data Engineering. O'Reilly Media. ISBN 9781098108304.
Study hours
Scheduled activity hours | |
---|---|
Assessment written exam | 2 |
Practical classes & workshops | 20 |
Work based learning | 10 |
Independent study hours | |
---|---|
Independent study | 118 |
Teaching staff
Staff member | Role |
---|---|
Sandra Sampaio | Unit coordinator |
Additional notes
Videos (~5 Hours)
Formative Quizzes (~2 Hours)
Formative Coursework (~10 Hours)
Assessed Coursework (~10 Hours)
Assessed Quizzes (~3 Hours)