BA Film Studies and English Language

Year of entry: 2022

Course unit details:
From Text to Linguistic Evidence

Unit code LELA10401
Credit rating 20
Unit level Level 1
Teaching period(s) Semester 1
Offered by Linguistics & English Language
Available as a free choice unit? Yes


The study of large amounts of text allows us to generate linguistic evidence based on language use, rather than on the linguist’s intuitions or prescribed ideas. Linguists create large, electronically stored collections of naturally occurring language, which are called ‘corpora’ (singular ‘corpus’). Corpora can include written texts of different genres, such as fiction, news and material from the internet, as well as transcriptions of spoken language. Corpus methods are applied in data analysis in linguistics and beyond, including in the social sciences, law, education and even the health sciences (e.g. for the analysis and identification of mood disorders), as well as in tech and other data-based industries. In this module, we will focus on the large variety of corpora of English.
This unit provides a theoretical and practical introduction to corpus linguistics. You will study how corpora are designed, categorised and annotated, and get an overview of the corpora available for studying the English language. You will learn what a good corpus linguistic study involves and how to carry one out yourself. To this end, you will receive basic training in the use of specialist software such as BNCweb, Sketch Engine, AntConc and R. You will learn corpus tools and techniques used to study a variety of linguistic questions, and come to understand how corpus methods can be applied in a variety of linguistic disciplines such as morphosyntax, semantics, pragmatics, sociolinguistics and historical linguistics.
This course is a possible pre-requisite for LELA31632 Forensic Linguistics.


The module aims to:

  • provide an introduction to corpus linguistics;
  • provide students with a good understanding of corpus design, annotation and corpus methods;
  • familiarise students with major corpus resources, tools and techniques for studying English;
  • teach students the practical skills to use these tools and perform a variety of corpus analyses;
  • develop a critical awareness of which corpus and which corpus tool can be used to answer a certain linguistic question;
  • develop a critical attitude towards the strengths and weaknesses of corpus research.

Knowledge and understanding

By the end of this course students will:

  • understand what corpora are and how they are designed;
  • be familiar with and able to apply a variety of corpus methods and techniques;
  • have a good knowledge of the range of corpora and corpus tools available for the study of English.

Intellectual skills

By the end of this course students will be able to:

  • critically evaluate the design of a particular corpus;
  • assess the strengths and weaknesses of a corpus approach to a certain problem;
  • decide which corpus, corpus tool and technique to use to investigate a particular question;
  • formulate research questions that are amenable to corpus research.

Practical skills

By the end of this course students will be able to:

  • carry out linguistic investigations using a variety of corpora and corpus tools;
  • use software to produce concordances, to create word lists, to perform key word analyses, etc.;
  • perform simple statistical tests;
  • design a corpus study;
  • confidently explore unknown corpora and tools, including those for languages other than English.
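
As a rough illustration of two of the techniques listed above, word lists and concordances, the following is a minimal Python sketch. It is not the output or method of BNCweb, Sketch Engine or AntConc; the sample text, the simple regular-expression tokeniser and the function names (`tokenize`, `word_list`, `concordance`) are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical miniature "corpus", invented purely for illustration.
TEXT = (
    "The cat sat on the mat. The dog saw the cat. "
    "A cat is not a dog, and the dog knows it."
)


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_list(tokens):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common()


def concordance(tokens, node, width=3):
    """Return keyword-in-context lines with `width` tokens either side of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines


tokens = tokenize(TEXT)
frequencies = word_list(tokens)
cat_lines = concordance(tokens, "cat")

for line in cat_lines:
    print(line)
```

Dedicated corpus tools add much more on top of this, such as annotation-aware queries, sorting of context and metadata filters, so the sketch only conveys the basic idea.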

Transferable skills and personal qualities

By the end of this course students will have developed:

  • advanced problem-solving skills;
  • new IT skills;
  • confidence in working with new resources and techniques;
  • essay-writing skills;
  • a critical attitude towards research methods.

Assessment methods

Two-part practical exercise on (a) construction of a spoken corpus and (b) a test for the independence of two variables: 30%
Mock exam: N/A
Exam: 70%
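
To give a sense of what a test for the independence of two variables can look like, here is a minimal Python sketch of Pearson's chi-squared test on a 2x2 table. The unit description does not say which test or software the exercise uses, so the choice of test is an assumption, and the counts and the function name `chi_squared_2x2` are hypothetical. No continuity correction is applied.

```python
import math

# Hypothetical counts, invented for illustration only:
# rows = speaker group (A, B), columns = linguistic variant used (X, Y).
observed = [
    [30, 70],
    [50, 50],
]


def chi_squared_2x2(table):
    """Pearson's chi-squared test of independence for a 2x2 table.

    Returns the test statistic, the p-value and the expected counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # A 2x2 table has one degree of freedom; for df = 1 the upper-tail
    # probability of the chi-squared distribution is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, expected


chi2, p_value, expected = chi_squared_2x2(observed)
print(f"chi-squared = {chi2:.3f}, p = {p_value:.4f}")
```

A small p-value would suggest that, in this invented data, the choice of variant is not independent of speaker group. Larger tables need a different degrees-of-freedom calculation, which this sketch does not cover.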


Feedback methods

Feedback method

Formative or summative

Personalised written feedback from course instructors on all submitted assignments.

Feedback from instructors during seminars and on discussion fora.



Recommended reading

The main readings will be taken from:

  • Hoffmann, S., Evert, S., Smith, N. and Lee, D. (2008) Corpus Linguistics with BNCweb: A Practical Guide. Frankfurt am Main: Peter Lang.
  • McEnery, T. and Hardie, A. (2012) Corpus Linguistics. Cambridge: Cambridge University Press.
  • McEnery, T., Xiao, R. and Tono, Y. (2006) Corpus-Based Language Studies: An Advanced Resource Book. London: Routledge.

Additional (suggested) readings will be provided week by week when necessary.

Study hours

Scheduled activity hours
Assessment written exam 2
Lectures 11
Seminars 22
Independent study hours
Independent study 165

Teaching staff

Staff member Role
Richard Zimmermann Unit coordinator
