Course unit details:
Foundational statistics with R
Unit code | LELA60141 |
---|---|
Credit rating | 15 |
Unit level | FHEQ level 7 – master's degree or fourth year of an integrated master's degree |
Teaching period(s) | Semester 1 |
Available as a free choice unit? | No |
Overview
This course aims to familiarize students with the basic concepts of statistics through hands-on practice and to build a foundation for more advanced studies in natural language processing. Topics covered in the course include distributions of data, basic principles of probability, describing and visualizing quantitative data, statistical modelling and interpreting quantitative data through hypothesis testing.
Aims
The unit aims to:
- Familiarize students with basic statistical concepts and terms necessary to understand and perform quantitative research
- Foster understanding of the principles of describing, visualizing, and interpreting data
- Enable students to develop R programming skills needed to work with quantitative data
- Foster organisational, evaluative and critical thinking skills necessary for conducting quantitative research
- Provide the mathematical foundations for applying regression methods in computational linguistics
Syllabus
Week 1: Variable types and introduction to R and RStudio
Week 2: Descriptive statistics and visualisations
Week 3: Introduction to the Linear Model
Week 4: Correlation and data transformation
Week 5: Multiple regression
Week 6: Reading week
Week 7: Regression with categorical predictors
Week 8: Interactions and nonlinear effects
Week 9: Logistic regression
Week 10: Statistical Inference
Week 11: Mixed models 1
Week 12: Mixed models 2
Teaching and learning methods
Weekly 2-hour lectures (online, asynchronous). These will introduce the theoretical and technical content for the topics covered in the seminars. Asynchronous delivery will allow students to cover the technical content at their own pace.
Five 2-hour synchronous seminars in a computer lab. The focus will be on individual and small-group computer-based activities implementing the methods described in the lectures, using RStudio. Analysis code will be provided via Blackboard for students to run on their own machines. The sessions will consist of collectively working through a series of activities in which students run the code provided, combined with exercises that students will complete individually or in small groups. The instructor will circulate and provide assistance as needed. On occasion, the whole class will collaborate to provide a solution.
Reading assignments (beyond the main textbook), revision quizzes and additional exercises will be provided between sessions. The Blackboard Discussion Board will be used for interaction between students and instructors.
Knowledge and understanding
Students will be able to:
- Demonstrate understanding of the fundamentals of quantitative data analysis
- Demonstrate knowledge of basic statistical methods
- Recall key principles for effective description and visualisation of data
- Compare characteristics of basic statistical models
Intellectual skills
Students will be able to:
- Identify appropriate descriptive and data visualization methods for different types of data
- Choose the appropriate statistical model for the type of data under analysis
- Reformulate a research question into a statistical hypothesis
Practical skills
Students will be able to:
- Create visualizations and summaries of data
- Fit a statistical model
- Write computer code in R to carry out a statistical analysis, from data description to model fitting
Transferable skills and personal qualities
Students will be able to:
- Explore and analyse quantitative data to extract information
- Draw inferences about the relationships between latent variables from quantitative data
- Generalise their quantitative analysis skills to new and unfamiliar scenarios
- Develop time management skills by working to deadlines
Assessment methods
Assessment task | Formative or summative | Weighting |
---|---|---|
In-class activities | Formative | 0% |
Research Report | Summative | 50% |
Exam | Summative | 50% |
Feedback methods
Research Report - Via Turnitin, 15 working days after submission
Exam - Within 15 working days after submission
Recommended reading
Winter, B. (2019). Statistics for linguists: An introduction using R. Routledge.
Dancey, C. P. & Reidy, J. (2007). Statistics without maths for psychology. Pearson Education.
Study hours
Scheduled activity hours | |
---|---|
Lectures | 22 |
Tutorials | 10 |
Independent study hours | |
---|---|
Independent study | 118 |
Teaching staff
Staff member | Role |
---|---|
Andrea Nini | Unit coordinator |
Patrycja Strycharczuk | Unit coordinator |