A Journey Across Football Modelling with Application to Algorithmic Trading

Kharrat, Tarak

[Thesis]. Manchester, UK: The University of Manchester; 2016.

Abstract

In this thesis we study the problem of forecasting the final score of a football match before the game kicks off (pre-match) and show how the derived models can be used to make a profit in an algorithmic trading (betting) strategy.

The thesis consists of two main parts. The first part discusses the database and a new class of counting processes. The second part describes the football forecasting models.

The data part discusses the details of the design, specification and data collection of a comprehensive database containing extensive information on match results and events, players' skills and attributes, and betting market prices. The database was created using state-of-the-art web-scraping, text-processing and data-mining techniques. At the time of writing, we have collected data on all games played in the five major European leagues since the 2009-2010 season and on more than 7000 players.

The statistical modelling part discusses forecasting models based on a new generation of counting processes with flexible inter-arrival time distributions. Several different methods for fast computation of the associated probabilities are derived and compared. The proposed algorithms are implemented in a contributed R package, Countr, available from the Comprehensive R Archive Network.

One of these flexible count distributions, the Weibull count distribution, was used to derive our first forecasting model. Its predictive ability is compared to the models previously suggested in the literature and tested in an algorithmic trading (betting) strategy. The model developed has been shown to perform rather well compared to its competitors.

Our second forecasting model uses the same statistical distribution but models the attack and defence strengths of each team at the player level rather than at the team level, as is systematically done in the literature. For this model we make heavy use of the data on the players' attributes discussed in the data part of the thesis. Not only does this model turn out to have higher predictive power, but it also allows us to answer important questions about the 'nature of the game', such as the contribution of the full-backs to the attacking effort or where a new team would finish in the Premier League.

Additional content not available electronically

Countr: an R package to fit flexible count regression (available from CRAN)

Bibliographic metadata

Degree type:
Doctor of Philosophy
Degree programme:
PhD Mathematical Sciences
Location:
Manchester, UK
Total pages:
124
Language:
en

Record metadata

Manchester eScholar ID:
uk-ac-man-scw:301066
Created by:
Kharrat, Tarak
Created:
26th May, 2016, 15:26:37
Last modified by:
Kharrat, Tarak
Last modified:
1st December, 2017, 09:09:02
