
Course unit details:
Introduction to Statistical Modelling
Unit code | SOST70011 |
---|---|
Credit rating | 15 |
Unit level | FHEQ level 7 – master's degree or fourth year of an integrated master's degree |
Teaching period(s) | Semester 1 |
Offered by | Social Statistics |
Available as a free choice unit? | Yes |
Overview
Many, if not most, social research questions concern causality, e.g. what are the causes of good and bad things in society? Only if we understand the causes can we hope to change their good or bad effects.
Much if not most of social research is observational, i.e. correlational; we can observe and measure things, ask people questions etc., but it’s not easy to run experiments. This means that often we only have correlational data with which to evaluate and test our causal research questions.
Taken together, the two conditions above present a problem, because as we all know, correlation does not equal causation.
Recently developed theories of causation challenge these limitations. We will use the theory of Directed Acyclic Graphs (DAGs) to understand how causality translates into correlations among variables. We will use this knowledge to help us specify statistical models that may help us evaluate our causal theories.
By the end of the course, students should be able to formulate and understand Directed Acyclic Graphs (DAGs), which represent hypothesised causal relationships among phenomena.
Students will then be able to use the DAGs to evaluate which predictor variables they need to include in statistical models designed to answer causal research questions.
The students will be able to use the R statistical software to estimate the general linear model (a.k.a. linear regression) and one variety of generalized linear model, binary logistic regression. Students will also be able to fit and interpret simple multilevel models, namely random intercept linear mixed (hierarchical) models.
The students will be able to use their knowledge of DAGs and the results of the statistical models to answer causal social research questions.
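As a rough illustration of why the choice of predictors matters, the sketch below simulates a hypothetical DAG with a single backdoor path, Z → X and Z → Y, and no causal arrow from X to Y. The variable names, effect sizes and sample size are assumptions made for this example and are not taken from the unit materials. It is written in plain Python for portability, whereas the unit itself uses R.

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a simple linear regression on x."""
    b = slope(x, y)
    a = sum(y) / len(y) - b * sum(x) / len(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical data-generating process: Z causes both X and Y,
# and X has NO causal effect on Y.
rng = random.Random(42)
n = 20000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [2 * zi + rng.gauss(0, 1) for zi in z]

# Naive model: Y regressed on X only. The backdoor path X <- Z -> Y is open.
naive = slope(x, y)

# Adjusted model: the coefficient of X when Z is also controlled for,
# obtained by regressing the Z-residuals of Y on the Z-residuals of X.
adjusted = slope(residuals(z, x), residuals(z, y))

print(f"naive slope of Y on X:      {naive:.2f}")
print(f"slope of Y on X, given Z:   {adjusted:.2f}")
```

Under these assumptions the naive slope should come out near 1 even though X does not cause Y, while the slope after controlling for Z should be close to 0. Reading the DAG tells us that Z is the variable to include in the model.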
Aims
Specifically:
1) Enable students to model data from large social surveys using linear and binary logistic regression modelling, and factor analysis.
2) Enable students to use such models to carry out hypothesis testing and to make valid inferences from the survey sample to the population of interest.
3) Enable students to interpret and critically evaluate the results of such modelling and inferential analyses.
4) Provide students with the skills to use SPSS to carry out the above analyses.
Teaching and learning methods
The course will be delivered in eleven 2-hour classes, each consisting of either a lecture or a Q&A session followed by a hands-on practical exercise. In the exercise the students will be required to carry out formative tasks designed to strengthen their understanding. Weekly back-up support will also be provided in the form of office hours. The students will be required to complete three pieces of formative homework and they will receive feedback on that work. The homework will either be in the form of structured short-answer questions requiring students to run and interpret simple analyses, or in the form of short reports on existing analyses. The latter will enable students to practise and receive feedback on the skills required for the assessment.
Knowledge and understanding
- Apply causal reasoning to DAGs to identify and distinguish between causal paths and non-causal (spurious, backdoor) paths.
- Identify which variables in the backdoor paths need to be controlled for, to isolate causal effects.
- Understand the meaning of parameters in general and generalized linear models, specifically linear regression and binary logistic regression.
- Understand and critically evaluate the results of such models, in terms of inferences to the population of interest.
Intellectual skills
- Understand and apply the logic of hypothesis testing.
- Apply causal reasoning to hypothesis-driven research.
Practical skills
- Use R software to specify and fit general and generalized linear models.
- Use R software to visualize data and the results of such models.
Transferable skills and personal qualities
- Apply logical reasoning and evidence to address issues of causality.
- Use probabilities to make inferences.
Assessment methods
Final report of up to 3000 words worth 100%. This assessment will present you with a DAG, a dataset, and a series of research questions for you to answer using the DAG, the data, and R software.
Feedback methods
Available via Turnitin
Recommended reading
Field, A. (2013). Discovering Statistics Using SPSS (2nd Ed.). London: Sage Publications.
Linneman, T. (2011). Social Statistics: The basics and beyond. Taylor & Francis.
(Linneman covers regression in much more practical detail than Field, but does not cover factor analysis.)
Study hours
Scheduled activity hours | |
---|---|
Lectures | 22 |
Independent study hours | |
---|---|
Independent study | 128 |
Teaching staff
Staff member | Role |
---|---|
Todd Hartman | Unit coordinator |
Additional notes
Compulsory for SRMS
Pre-Requisite for CSDA, SEM and LDA
Timetable
Thursday 2-4 Remote access to Simon 6.004 cluster