
Year of entry: 2021

Course unit details:
Computational Linguistics

Unit code: LELA32051
Credit rating: 20
Unit level: Level 3
Teaching period(s): Semester 1
Offered by: Linguistics & English Language
Available as a free choice unit? Yes

Overview

The last two decades have seen an explosion in the use of language technologies, from consumer applications such as Alexa and Google Translate to behind-the-scenes use by, for example, social media, news and marketing companies. This course unit provides an introduction to natural language processing (NLP), the processing of human language by computer. It focuses on technologies for representing word meaning, performing syntactic analysis of sentences, composing sentence meanings, translating between languages and conducting human-machine conversation. We will consider ways in which linguistic theory is useful in performing each of these tasks, and conversely how decades of experience in building such systems can inform linguistic theory. Students will acquire a basic knowledge of the Python programming language and gain experience of building the kinds of models that are deployed in real-world technologies. No prior programming experience is required.

Pre/co-requisites

A foundational unit in morphology and syntax, e.g. LELA10301 English Word and Sentence Structure, is recommended.

Aims

The principal aims of the course unit are to:

  • Familiarise students with different approaches to the computer processing of human language
  • Enable students to decide which technologies to apply to novel NLP challenges
  • Give students experience of building, running and interpreting the performance of programs
  • Encourage students to apply insights gained from the computer processing of natural language to their analysis of linguistic data and development of linguistic theory

Learning outcomes

    Knowledge and understanding

    Students who successfully complete this course will acquire an understanding of:

    • Probabilistic approaches to language
    • Two core types of machine learning (supervised learning, unsupervised learning)
    • Five key areas of NLP (vector space and embedding representations of word meaning, part of speech tagging and parsing, neural sequence models, machine translation and dialogue systems)
    • Fundamentals of computer text processing (file handling, tokenisation and normalisation, regular expressions) and powerful NLP/machine learning packages

    Intellectual skills

    Students who successfully complete this course will develop and demonstrate skills in:

    • Adapting theories and intuitions to messy real-world data
    • Scaling up theories and intuitions to big data
    • Thinking formally about uncertainty and ambiguity
    • Developing and analysing formal algorithms and procedures

    Practical skills

    Students who successfully complete this course will develop and demonstrate the ability to:

    • Perform the simple text processing tasks (input/output, tokenisation, normalisation) needed to make use of NLP and machine learning packages
    • Run computational linguistic experiments in provided Python notebooks
    • Interpret and report on the performance of natural language processing systems
    • Decide which algorithms to deploy for a new NLP problem
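
As a rough illustration of the simple text processing tasks listed above (tokenisation and normalisation using regular expressions), the following is a minimal sketch rather than material taken from the unit itself. The function name `tokenise`, the sample sentence and the particular regular expression are assumptions made for this example; real NLP packages use more careful rules.

```python
import re
from collections import Counter


def tokenise(text):
    """Lower-case the text and split it into word tokens (illustrative only)."""
    # Normalisation: case-fold so that "The" and "the" count as the same word.
    normalised = text.lower()
    # Tokenisation: runs of letters or digits, allowing internal apostrophes
    # so that a contraction such as "didn't" stays as one token.
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", normalised)


# Hypothetical sample text, chosen only to show the behaviour.
sample = "The cat sat on the mat. The mat didn't mind!"
tokens = tokenise(sample)
counts = Counter(tokens)

print(tokens)
print(counts.most_common(2))
```

Counting the normalised tokens, as in the last two lines, is the kind of preprocessing step that typically comes before text is passed to an NLP or machine learning package.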

    Transferable skills and personal qualities

    Students who successfully complete this course will develop and demonstrate the ability to:

    • Organise and access data in the cloud
    • Run programs in the cloud
    • Apply training to an unfamiliar domain
    • Deal with points of incompatibility between their prior assumptions and data

    Assessment methods

    Exam: 50%
    Coursework: 50%
    Mock exam: N/A (formative)

    Feedback methods

    Written and oral feedback on coursework report (summative)
    Written and oral feedback on exam (summative)
    Oral feedback on mock exam (formative)

    Recommended reading

    Bird, S., Klein, E., & Loper, E. (2009). Chapter 5: Categorizing and Tagging Words and Chapter 8: Analyzing Sentence Structure. Natural language processing with Python: analyzing text with the natural language toolkit. O'Reilly. http://www.nltk.org/book/ch05.html and http://www.nltk.org/book/ch08.html

    Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179-211.

    Jurafsky, D., & Martin, J. H. (2020). Chapter 6: Vector Semantics and Embeddings and Chapter 24: Chatbots and Dialogue Systems. Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. 3rd Edition. Prentice-Hall. https://web.stanford.edu/~jurafsky/slp3/6.pdf and https://web.stanford.edu/~jurafsky/slp3/24.pdf

    Ritter, A., Cherry, C., & Dolan, W. B. (2011). Data-driven response generation in social media. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (pp. 583-593). https://www.aclweb.org/anthology/D11-1054/

    Weizenbaum, J. (1966). ELIZA--A Computer Program for the Study of Natural Language Communication Between Man and Machine. Communications of the ACM, 9(1), 36-45.

    Study hours

    Scheduled activity hours
    Lectures: 11
    Seminars: 22
    Independent study hours
    Independent study: 167

    Teaching staff

    Unit coordinator: Colin James Bannard
