Course unit details:
Data Science and Machine Learning 1
Unit code | ECON61351 |
---|---|
Credit rating | 15 |
Unit level | FHEQ level 7 – master's degree or fourth year of an integrated master's degree |
Teaching period(s) | Full year |
Available as a free choice unit? | No |
Overview
This unit is the first in a sequence of two units (ECON6xxx2 being the follow-on unit) that help students on the MSc Economics and Data Science develop vital study, employability, and programming skills. It introduces students to core data science and machine learning methods for the analysis of economic data, and supports them in developing a comprehensive understanding of both the statistical theory behind the methods and the practical issues surrounding their implementation in the programming language R. Throughout the unit, and through the work it involves, students will be supported in developing vital employability skills, such as communicating results to a variety of audiences.
Aims
Provide a working knowledge of the theory underlying machine learning methods.
Provide, in conjunction with ECON62020 “Programming and other Skills for Data Scientists” (which runs in parallel with ECON61351), the opportunity to engage in directed work that implements econometric and data science methods.
Provide the opportunity to develop skills that are vital for advanced study and employability in the fields of economics and data science.
Learning outcomes
To take up positions in government, central banks or private sector organisations as a data analyst or economist, students will have to demonstrate strong skills in the areas supported by this unit:
statistical methods for data-scientific analysis such as machine learning
the mathematical theory behind data-scientific methods
the implementation and interpretation of empirical data-scientific analysis of economics data
communication (written)
Syllabus
Provisional
Intro to Statistical learning / predictive methods
Supervised learning: Linear regression, k-nearest neighbor
Regularization, shrinkage (ridge, lasso)
Unsupervised learning: PCA
Basics of binary classification methods
Logistic regression, naive Bayes, linear and nonlinear SVMs
Bagging, Boosting, Tree based methods, random forest
Multivariate multinomial logit / multiclass classifiers (link to demand models)
Teaching and learning methods
Student work will be organised around lectures and tutorial sessions. The tutorials are centred on problem sets and empirical exercises; during the sessions, students finalise or continue work prepared asynchronously.
Any learning materials required will be delivered through the unit’s Blackboard site.
Lecture attendance: 22h (11 weeks x 2h allowing for start in week 2 of term)
Tutorial attendance: 11h (11 x 1h)
Preparation and consolidation work for lecture material: 70h
Preparation and consolidation work for tutorial material: 20h
Revision for assessments: 27h
Sum: 150h
Assessment methods
Individual empirical project (IEP), 1000 words, 30%
Midterm (MT), empirical (students are expected to complete a short empirical investigation within an allocated amount of time), 500 words, 20%
Final exam (EX), 1.5h, 50%
Recommended reading
Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani (2021) An Introduction to Statistical Learning with Applications in R, Springer Texts in Statistics, New York, USA.
Study hours
Scheduled activity hours | |
---|---|
Lectures | 22 |
Tutorials | 11 |
Teaching staff
Staff member | Role |
---|---|
Alastair Hall | Unit coordinator |