Apply through UCAS
- UCAS course code: G400
- UCAS institution code: M20
Course unit details:
Data Science
Unit code | COMP13212 |
---|---|
Credit rating | 10 |
Unit level | Level 1 |
Teaching period(s) | Semester 2 |
Available as a free choice unit? | Yes |
Overview
This course unit has two objectives. The first is to introduce the student to a range of fundamental, non-trivial algorithms, and to the techniques required to analyse their correctness and running time.
The second is to present a conceptual framework for analysing the intrinsic complexity of computational problems, which abstracts away from details of particular algorithms.
Aims
• To give students awareness of the elements of “The Data Science Process”. Many, although not all, of the elements of this process will be studied in finer detail.
• To give students practice using Python tools for data processing and analysis, through practical computer laboratory exercises. Tools include numpy, scipy.stats, pandas, and Jupyter notebooks.
• To demonstrate methods for exploring and visualising data, and give students practice in using these methods.
• To give students an understanding of uncertainty in data, in particular methods for measuring uncertainty and when each measure is appropriate.
• To give students an introduction to statistical thinking and Bayesian reasoning.
• To give students an introduction to ethical considerations when analysing data and drawing responsible conclusions.
• To give a brief introduction to concepts from machine learning, including: classification/regression, overfitting/underfitting, the need for independent testing data, and cross-validation, including leave-one-out validation.
• To demonstrate some practical applications of basic machine learning methods, including the Bayesian classifier, the naive Bayes classifier, linear regression, and logistic regression.
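As an illustration of the leave-one-out cross-validation mentioned in the aims, here is a minimal sketch applied to simple linear regression. It is not taken from the unit materials: the helper names `fit_line` and `loocv_mse` and the data values are made up for this example, and it uses only the Python standard library rather than the numpy or pandas tools used in the labs.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (hypothetical helper)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def loocv_mse(xs, ys):
    """Mean squared prediction error under leave-one-out cross-validation."""
    errors = []
    for i in range(len(xs)):
        # Hold out point i and fit the model on all remaining points.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        # Score the model on the single held-out point it has not seen.
        prediction = slope * xs[i] + intercept
        errors.append((ys[i] - prediction) ** 2)
    return mean(errors)


# Made-up illustrative data, roughly y = 2x + 1 with some noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0]
print(f"LOOCV mean squared error: {loocv_mse(xs, ys):.3f}")
```

Because every point is predicted by a model that never saw it, the resulting error estimates performance on independent data, which is why the technique helps detect overfitting.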
Learning outcomes
- Demonstrate awareness of the “Data Science Process” by describing qualitatively how it would apply in a given situation.
- Demonstrate awareness of the need for data cleaning, both descriptively and by carrying out elementary data cleaning and preparation in the laboratory.
- Demonstrate the ability to measure and express uncertainty in a set of data and in quantities derived from that data.
- Demonstrate the ability to choose and build appropriate models of different datasets.
- Demonstrate the ability to evaluate the quality of a model of a dataset.
- Demonstrate the ability to compare different models of a dataset, and models of different datasets, in order to draw statistically sound conclusions about hypotheses or claims from the data.
- Demonstrate the ability to use Python tools to: read and write datasets to and from files; produce descriptive statistics and draw conclusions from them; produce graphical visualisations and draw conclusions from them; perform basic statistical tests, including a test of the difference between means; and perform a simple machine learning experiment by building an email spam filter using a naive Bayes classifier.
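The spam-filter experiment named in the last outcome could look roughly like the following sketch of a multinomial naive Bayes classifier with add-one (Laplace) smoothing. This is an assumption about the general technique, not the unit's actual laboratory exercise: the function names `train` and `classify`, the whitespace tokenisation, and the four training messages are all invented for illustration.

```python
import math
from collections import Counter


def train(messages):
    """Count words per class from (text, label) pairs (hypothetical helper)."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in messages:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior score."""
    word_counts, class_counts, vocab = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Start from the log prior P(label).
        score = math.log(class_counts[label] / total_messages)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing keeps unseen (word, label) pairs non-zero.
            count = word_counts[label][word]
            score += math.log((count + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Tiny made-up training set; a real experiment would use a labelled corpus.
training = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting moved to monday", "ham"),
    ("lecture notes for monday attached", "ham"),
]
model = train(training)
print(classify("claim your free prize", model))
print(classify("notes for the monday lecture", model))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied. In a real experiment the classifier would be evaluated on messages held out from training.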
Teaching and learning methods
Lectures and coursework reported via Jupyter notebooks in Python.
Assessment methods
Method | Weight |
---|---|
Written exam | 80% |
Practical skills assessment | 20% |
Recommended reading
To be determined
Study hours
Scheduled activity hours | |
---|---|
Lectures | 22 |
Practical classes & workshops | 12 |
Independent study hours | |
---|---|
Independent study | 66 |
Teaching staff
Staff member | Role |
---|---|
Ainur Begalinova | Unit coordinator |