BA Linguistics and Italian / Course details

Year of entry: 2024

Course unit details:
From Text to Linguistic Evidence

Course unit fact file
Unit code LELA10401
Credit rating 20
Unit level Level 1
Teaching period(s) Semester 1
Available as a free choice unit? Yes

Overview

The study of large amounts of text allows us to generate linguistic evidence based on language use, rather than on the linguist’s intuitions or prescribed ideas. Linguists create large, electronic collections of naturally occurring texts, which are called ‘corpora’ (singular ‘corpus’). Corpora can include written texts of different genres, such as fiction, news and material from the internet, as well as transcriptions of spoken language, among others. Corpus methods are applied to analyse texts for research in linguistics and beyond, including in the social sciences, law, education, healthcare, tech and other data-based industries. In this module, we will focus on the large variety of corpora of English.

You will study how corpora are designed, categorised and further annotated, and get an overview of the corpora available to study the English language. You will learn what a good corpus linguistic study involves and how to do one yourself. To this end, you will receive basic training in the use of specialist software such as BNCweb, Sketch Engine and AntConc. You will learn corpus tools and techniques used to study a variety of linguistic questions, and come to understand how corpus methods can be applied in a variety of linguistic disciplines such as morphosyntax, semantics, pragmatics, sociolinguistics and historical linguistics.

 

This course is a possible pre-requisite for LELA3163X Forensic Linguistics.

Aims

The module aims to:

  • Provide an introduction to corpus linguistics;
  • Provide students with a good understanding of corpus design, annotation and corpus methods;
  • Familiarise students with major corpus resources, tools and techniques for studying English;
  • Teach students the practical skills to use these tools and perform a variety of corpus analyses;
  • Develop a critical awareness of which corpus and which corpus tool can be used to answer a certain linguistic question;
  • Develop a critical attitude towards the strengths and weaknesses of corpus research.

Syllabus

The following are examples of topics that may be covered in the lectures and seminars:

  • What is a corpus?
  • History of corpus linguistics
  • Overview of available corpora for the English language
  • Corpus-based research design, including turning research questions into searchable queries, selecting the right corpus and the right corpus techniques for specific questions, the scientific method
  • Data collection from corpora, including the generation of KWIC concordances and word lists
  • Investigating differences in linguistic behaviour between two groups, including the concepts of statistical significance and effect size, the chi-square test and the odds ratio
  • Relative frequency
  • Collocations and related techniques
  • Corpus annotation techniques, including part-of-speech (POS) tagging

These topics are indicative only and subject to change.
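
As an informal illustration of three of the topics above (word lists, KWIC concordances and relative frequency), the following is a minimal Python sketch. It is not taken from the course materials and does not reproduce how BNCweb, Sketch Engine or AntConc work internally. The function names, the very simple tokeniser and the toy sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (a crude assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, node, window=3):
    """Return (left context, node, right context) for every occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


def relative_frequency(count, corpus_size, per=1_000_000):
    """Normalise a raw count to a frequency per `per` tokens (default: per million)."""
    return count / corpus_size * per


# Toy "corpus" invented for illustration only.
text = "The cat sat on the mat. The dog saw the cat."
tokens = tokenize(text)

# Word list: every word type with its raw frequency, most frequent first.
word_list = Counter(tokens).most_common()

# Concordance lines for the node word "cat" with two words of context either side.
concordance = kwic(tokens, "cat", window=2)

# Relative frequency of "the", normalised per 100 tokens because the text is tiny.
the_per_100 = relative_frequency(Counter(tokens)["the"], len(tokens), per=100)
```

Normalising raw counts in this way is what makes frequencies comparable across corpora of different sizes; for real corpora the usual base is per million words.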

Teaching and learning methods

One 1-hour lecture per week

A total of 2 hours of (computer) seminars per week

Optional individual consultation sessions

Lecture and supporting materials will be made available on Canvas.

Knowledge and understanding

By the end of this course students will:

  • Understand what corpora are and how they are designed;
  • Be familiar with and able to apply a variety of corpus methods and techniques;
  • Have a good knowledge of the range of corpora and corpus tools available for the study of English.

Intellectual skills

By the end of this course students will be able to:

  • Critically evaluate the design of a particular corpus;
  • Assess the strengths and weaknesses of a corpus approach to a certain problem;
  • Decide which corpus, corpus tool and technique to use to investigate a particular question;
  • Formulate research questions that are amenable to corpus research.

Practical skills

By the end of this course students will be able to:

  • Carry out linguistic investigations using a variety of corpora and corpus tools;
  • Use software to produce concordances, create word lists, perform keyword analyses, etc.;
  • Perform simple statistical tests;
  • Design a corpus study;
  • Confidently explore unknown corpora and tools, including those for languages other than English.
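
To give a rough idea of what a "simple statistical test" can look like in a corpus study, here is a small Python sketch of a Pearson chi-square statistic and an odds ratio for a 2x2 table. It is an illustrative assumption, not the procedure prescribed in the course: the function names and the counts are hypothetical, and no continuity correction is applied.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def odds_ratio(a, b, c, d):
    """Odds of the feature in group 1 divided by the odds in group 2 (an effect size)."""
    return (a * d) / (b * c)


# Hypothetical counts: a word occurs 30 times in 1,000 tokens from group 1
# and 10 times in 1,000 tokens from group 2.
a, b = 30, 1000 - 30   # group 1: occurrences of the word, all other tokens
c, d = 10, 1000 - 10   # group 2: occurrences of the word, all other tokens

chi2 = chi_square_2x2(a, b, c, d)
ratio = odds_ratio(a, b, c, d)

# With 1 degree of freedom, the critical value at p = 0.05 is about 3.84.
significant = chi2 > 3.84
```

The chi-square statistic addresses whether the difference between the two groups is likely to be due to chance, while the odds ratio indicates how large the difference is, which is why significance and effect size are reported together.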

Transferable skills and personal qualities

By the end of this course students will have developed:

  • Advanced problem-solving skills;
  • New IT skills;
  • Confidence in working with new resources and techniques;
  • Essay writing skills;
  • Critical attitude towards research methods.

Assessment methods

Assessment task       Formative or summative     Weighting
Midterm assignment    Formative and summative    30%
Mock exam             Formative                  0%
Exam                  Summative                  70%

Feedback methods

Feedback method                                                                         Formative or summative
Personalised written feedback from course instructors on all submitted assignments.    Summative
Feedback from instructors during seminars, for the midterm assignment and in person.   Formative

 

Recommended reading

The main readings will be taken from:

  • Hoffmann, S., Evert, S., Smith, N. and Lee, D. (2008) Corpus Linguistics with BNCweb: A Practical Guide. Frankfurt am Main: Peter Lang.
  • McEnery, T. and Hardie, A. (2012) Corpus Linguistics. Cambridge: Cambridge University Press.
  • McEnery, T., Xiao, R. and Tono, Y. (2006) Corpus-Based Language Studies: An Advanced Resource Book. London: Routledge.

Additional (suggested) readings will be provided week by week when necessary.

Study hours

Scheduled activity hours
Assessment written exam 2
Lectures 11
Seminars 22
Independent study hours
Independent study 165
