BA English Language and Japanese / Course details
Year of entry: 2023
Course unit details:
From Text to Linguistic Evidence
|Unit level||Level 1|
|Teaching period(s)||Semester 1|
|Available as a free choice unit?||Yes|
Studying large amounts of text allows us to generate linguistic evidence based on language use rather than on the linguist’s intuitions or prescribed ideas. Linguists create large, electronically stored collections of naturally occurring language, which are called ‘corpora’ (singular ‘corpus’). Corpora can include written texts of different genres, such as fiction, news and material from the internet, as well as transcriptions of spoken language. Corpus methods are applied in data analysis in linguistics and beyond, including in the social sciences, law, education and even the health sciences (e.g. for the analysis and identification of mood disorders), as well as in tech and other data-based industries. In this module, we will focus on the large variety of corpora of English.
This unit provides a theoretical and practical introduction to corpus linguistics. You will study how corpora are designed, categorised and annotated, and get an overview of the corpora available for studying the English language. You will learn what a good corpus linguistic study involves and how to carry one out yourself. To this end, you will receive basic training in the use of specialist software such as BNCweb, Sketch Engine, AntConc and R. You will learn corpus tools and techniques used to study a variety of linguistic questions, and come to understand how corpus methods can be applied in linguistic disciplines such as morphosyntax, semantics, pragmatics, sociolinguistics and historical linguistics.
This course is a possible pre-requisite for LELA31632 Forensic Linguistics.
The module aims to:
- Provide an introduction to corpus linguistics;
- Provide students with a good understanding of corpus design, annotation and corpus methods;
- Familiarise students with major corpus resources, tools and techniques for studying English;
- Teach students the practical skills to use these tools and perform a variety of corpus analyses;
- Develop a critical awareness of which corpus and which corpus tool can be used to answer a certain linguistic question;
- Develop a critical attitude towards the strengths and weaknesses of corpus research.
These are examples of topics that will be covered in the lectures and seminars:
- What is a corpus
- History of corpus linguistics
- Overview of available corpora for the English language
- Corpus-based research design, including turning research questions into searchable queries, selecting the right corpus and the right corpus techniques for specific questions, and the scientific method
- Data collection from corpora, including the generation of KWIC concordances and word lists
- Investigating differences in linguistic behaviour between two groups, including the concepts of statistical significance and effect size, the chi-square test and the odds ratio
- Collocations and related techniques
- Corpus annotation techniques, including part-of-speech (POS) tagging
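Two of the topics listed above, word lists and KWIC (key word in context) concordances, can be illustrated with a minimal sketch. It is written in plain Python purely for illustration; the unit itself names BNCweb, Sketch Engine, AntConc and R as its software. The sample sentence, the simple tokenisation rule and the function names `tokenize`, `word_list` and `kwic` are all assumptions made for this example, not part of the course materials.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def word_list(tokens):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common()


def kwic(tokens, node, window=3):
    """Return one (left context, node word, right context) triple per hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


# Invented two-sentence "corpus", used only to show the output format.
sample = "The cat sat on the mat. The dog sat on the cat."
tokens = tokenize(sample)

frequencies = word_list(tokens)
print(frequencies[:3])

concordance = kwic(tokens, "sat", window=2)
for left, node, right in concordance:
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>15}  {node}  {right}")
```

Aligning the node word in a central column is what makes a concordance readable: recurring patterns to the left and right of the search term become visible at a glance. Dedicated corpus tools add sorting, annotation-aware queries and much larger data sets on top of this basic idea.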
Teaching and learning methods
One 1-hour lecture per week
A total of 2 hours of (computer) seminars per week
Optional individual consultation sessions
Lecture and supporting materials will be made available on Blackboard.
Knowledge and understanding
By the end of this course students will:
- Understand what corpora are and how they are designed;
- Be familiar with and able to apply a variety of corpus methods and techniques;
- Have a good knowledge of the range of corpora and corpus tools available for the study of English.
By the end of this course students will be able to:
- Critically evaluate the design of a particular corpus;
- Assess the strengths and weaknesses of a corpus approach to a certain problem;
- Decide which corpus, corpus tool and technique to use to investigate a particular question;
- Formulate research questions that are amenable to corpus research.
By the end of this course students will be able to:
- Carry out linguistic investigations using a variety of corpora and corpus tools;
- Use software to produce concordances, create word lists, perform keyword analyses, etc.;
- Perform simple statistical tests;
- Design a corpus study;
- Confidently explore unknown corpora and tools, including those for languages other than English.
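As a rough illustration of the kind of simple statistical test mentioned in this list, the sketch below computes a chi-square statistic and an odds ratio for a 2x2 table that compares how often a feature occurs in two groups of speakers. It is written in plain Python for illustration only; the counts are invented, the function names are hypothetical, and the statistic is the Pearson chi-square without continuity correction, which may differ from the variant a given tool reports.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def odds_ratio(a, b, c, d):
    """Odds of the feature in group A divided by the odds in group B."""
    return (a * d) / (b * c)


# Invented counts: occurrences of a feature vs. all other cases in each group.
group_a_hits, group_a_other = 30, 70
group_b_hits, group_b_other = 10, 90

chi2 = chi_square_2x2(group_a_hits, group_a_other, group_b_hits, group_b_other)
p = p_value_df1(chi2)
ratio = odds_ratio(group_a_hits, group_a_other, group_b_hits, group_b_other)

print(f"chi-square = {chi2:.2f}, p = {p:.5f}, odds ratio = {ratio:.2f}")
```

The two numbers answer different questions: the p-value indicates whether a difference this large would be surprising if the groups behaved alike (statistical significance), while the odds ratio describes how large the difference is (effect size).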
Transferable skills and personal qualities
By the end of this course students will have developed:
- Advanced problem-solving skills;
- New IT skills;
- Confidence in working with new resources and techniques;
- Essay writing skills;
- Critical attitude towards research methods.
Formative or Summative
Mock Exam Questions
Problem set on miscellaneous tasks on corpus design and methodology
Formative and Summative
Formative or summative
Personalized written feedback from course instructors on all submitted assignments.
Feedback from instructors during seminars and on discussion fora.
The main readings will be taken from:
- Hoffmann, S., Evert, S., Smith, N. and Lee, D. (2008) Corpus Linguistics with BNCweb: A Practical Guide. Frankfurt am Main: Peter Lang.
- McEnery, T. and Hardie, A. (2012) Corpus Linguistics. Cambridge: Cambridge University Press.
- McEnery, T., Xiao, R. and Tono, Y. (2006) Corpus-Based Language Studies: An Advanced Resource Book. London: Routledge.
Additional (suggested) readings will be provided week by week when necessary.
|Scheduled activity hours|
|Assessment written exam||2|
|Independent study hours|
|Richard Zimmermann||Unit coordinator|