MRes Criminology (Social Statistics) / Course details

Year of entry: 2024

Course unit details:
Introduction to Statistical Modelling

Course unit fact file
Unit code SOST70011
Credit rating 15
Unit level FHEQ level 7 – master's degree or fourth year of an integrated master's degree
Teaching period(s) Semester 1
Available as a free choice unit? Yes


Many, if not most, social research questions concern causality, e.g. what are the causes of good and bad things in society? Only if we understand the causes can we hope to modify the good/bad effects.

Much if not most of social research is observational, i.e. correlational; we can observe and measure things, ask people questions etc., but it’s not easy to run experiments. This means that often we only have correlational data with which to evaluate and test our causal research questions.

Taken together, the two conditions above present a problem, because as we all know, correlation does not equal causation.

Recently developed theories of causation challenge these limitations. We will use the theory of Directed Acyclic Graphs (DAGs) to understand how causality translates into correlations among variables. We will use this knowledge to help us specify statistical models that may help us evaluate our causal theories.
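The way a causal structure produces correlations can be illustrated with a small simulation. The sketch below uses Python with only the standard library (the unit itself teaches R); the variable names, effect sizes, seed and sample size are all assumptions chosen for illustration. A common cause Z drives both X and Y, and X has no effect on Y at all, yet X and Y are clearly correlated until Z is adjusted for.

```python
import random

def mean(values):
    return sum(values) / len(values)

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def residuals(y, z):
    """Residuals from a simple least-squares regression of y on z."""
    my, mz = mean(y), mean(z)
    slope = (sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
             / sum((zi - mz) ** 2 for zi in z))
    return [(yi - my) - slope * (zi - mz) for yi, zi in zip(y, z)]

# Assumed DAG for illustration: Z -> X and Z -> Y, with no arrow from X to Y.
rng = random.Random(1)
n = 20000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

raw_corr = corr(x, y)                                    # confounded association
adjusted_corr = corr(residuals(x, z), residuals(y, z))   # association after adjusting for Z

print(f"correlation of X and Y:             {raw_corr:.3f}")
print(f"correlation after adjusting for Z:  {adjusted_corr:.3f}")
```

Under these assumptions the unadjusted correlation should be close to 0.5, while the adjusted one should be close to zero, which is what the DAG predicts once the common cause is held constant.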

By the end of the course, students should be able to formulate and understand Directed Acyclic Graphs (DAGs), which represent hypothesised causal relationships among phenomena.

Students will then be able to use the DAGs to evaluate which predictor variables they need to include in statistical models designed to answer causal research questions.
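As a rough illustration of this step, the sketch below lists every path between a predictor and an outcome in a small DAG and labels each one as causal or backdoor. It is written in Python rather than R, and the graph, node names and helper functions (`paths_between`, `classify`, `render`) are hypothetical, made up for this example. It only enumerates and labels paths; it does not check whether a path is open or blocked, which depends on colliders and on what is conditioned on.

```python
# Hypothetical DAG: Z -> X, Z -> Y, X -> M, M -> Y
edges = [("Z", "X"), ("Z", "Y"), ("X", "M"), ("M", "Y")]

def paths_between(edges, start, end):
    """All simple paths from start to end, ignoring arrow direction.

    Each path is a list of (from_node, to_node, forward) steps, where
    forward is True when the step follows the direction of the arrow.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, []).append((b, True))   # a -> b, with the arrow
        neighbours.setdefault(b, []).append((a, False))  # b to a, against the arrow
    found = []

    def walk(node, visited, steps):
        if node == end:
            found.append(list(steps))
            return
        for nxt, forward in neighbours.get(node, []):
            if nxt not in visited:
                walk(nxt, visited | {nxt}, steps + [(node, nxt, forward)])

    walk(start, {start}, [])
    return found

def classify(path):
    """Label a path by the direction of its arrows."""
    if all(forward for _, _, forward in path):
        return "causal"            # every arrow points from predictor towards outcome
    if not path[0][2]:
        return "backdoor"          # first arrow points into the predictor
    return "other non-causal"      # leaves the predictor but reverses later

def render(path):
    text = path[0][0]
    for _, nxt, forward in path:
        text += (" -> " if forward else " <- ") + nxt
    return text

result = {render(p): classify(p) for p in paths_between(edges, "X", "Y")}
for description, label in result.items():
    print(f"{description}: {label}")
```

In this assumed graph, the backdoor path runs through Z, so Z is the variable to include as a control; M lies on the causal path and would not be controlled for when the total effect of X on Y is of interest.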

The students will be able to use the R statistical software to estimate the general linear model (a.k.a. linear regression) and one variety of generalized linear model, binary logistic regression. Students will also be able to fit and interpret simple multilevel models, namely random intercept linear mixed (hierarchical) models.

The students will be able to use their knowledge of DAGs and the results of the statistical models to answer causal social research questions.


The course unit aims to: Give students a basic understanding of causal theory as applied to directed acyclic graphs, which allows them to specify theory-driven statistical models to estimate the causal effects of a target predictor variable on an outcome variable.

Learning outcomes

Apply causal reasoning to directed acyclic graphs, to identify and distinguish between causal and non-causal (spurious, backdoor) paths. Identify which variables on the backdoor paths need to be controlled for to isolate the causal effect.

Specify and fit statistical models to estimate the causal effects in a sample of data, using R software.

Understand and critically evaluate the results of the model, in terms of inferences to the population of interest.

Teaching and learning methods

Textbooks and scholarly articles.

Short and extended video presentations of course material and other relevant resources.

On-line, interactive sessions, setting students problems and discussing solutions, using video-conferencing software.

On-line, interactive practicals in R, delivered using video-conferencing software.

Blackboard VLE for distributing, sharing and discussing learning materials.

Knowledge and understanding

Causal reasoning on directed acyclic graphs.

General linear models; generalized linear models, specifically binary logistic regression; linear mixed effects (multilevel) models.

Intellectual skills

Hypothesis testing; Critical application of causal reasoning to hypothesis-driven research.

Practical skills

Using R software to specify and fit general and generalized linear models.

Transferable skills and personal qualities

Logical reasoning and argument; use of probabilities to make inferences.

Assessment methods

0% Formative assignment for understanding causality on DAGs (up to 300 words)

25% Understanding causality on DAGs assignment (multiple-choice test equivalent to a half-hour exam)

0% Formative assignment on interpretation of linear model results (up to 300 words)

25% Model results interpretation assignment (multiple-choice test equivalent to a half-hour exam)

0% Formative assignment on model building for causal hypothesis testing in R (a short coding assignment, equivalent to up to 300 words)

50% Model building for causal hypothesis testing assessment (a 1,500-word report on an analysis in R conducted by the student)

Feedback methods

Feedback on formative assignments is available through assessment answers and office hours. Feedback on assessed work is available through assessment answers and a discussion session. Individual feedback is given on the final reports.

Recommended reading

DiPrete, T. A., & Forristal, J. D. (1994). Multilevel Models: Methods and Substance. Annual Review of Sociology, 20, 331–357.

Elwert, F., & Winship, C. (2014). Endogenous Selection Bias: The Problem of Conditioning on a Collider Variable. Annual Review of Sociology, 40(1), 31–53.

Hox, J. J. (2002). Multilevel Analysis: Techniques and Applications. Erlbaum.

Imai, K. (2018). Quantitative Social Science: An Introduction. Princeton University Press.

McShane, B. B., Gal, D., Gelman, A., Robert, C., & Tackett, J. L. (2019). Abandon Statistical Significance. American Statistician, 73(sup1), 235–245.

Pearl, J., Glymour, M., & Jewell, N. P. (2016). Causal Inference in Statistics. Wiley.

Rohrer, J. M. (2018). Thinking Clearly About Correlations and Causation: Graphical Causal Models for Observational Data. Advances in Methods and Practices in Psychological Science, 1(1), 27-42. doi:10.1177/2515245917745629

Verzani, J. (2001). SimpleR: Using R for Introductory Statistics.

Study hours

Scheduled activity hours
Lectures 20

Teaching staff

Staff member Role
Nicholas Shryane Unit coordinator

Additional notes

Independent study hours

Private Study 80

Directed Reading 50

Compulsory for SRMS

Pre-Requisite for CSDA, SEM and LDA
