
Diversity, Margins and Non-Stationary Learning

Stapenhurst, Richard John

[Thesis]. Manchester, UK: The University of Manchester; 2012.

Abstract

Ensemble methods are frequently applied to classification problems, and generally improve upon the performance of individual models. Diversity is considered to be an important factor in this performance improvement; in the literature there is strong support for the idea that high diversity is crucial in ensembles. Voting margins provide an alternative explanation of the behaviour of ensembles; they have been prominently used in the interpretation of the AdaBoost algorithm, and the literature suggests that large margins are beneficial. In this thesis, we examine these two quantities, both of which the literature suggests should be increased, and show that (in 2-class problems) they are inversely related, with high diversity corresponding to small absolute margins. From this it can be seen that the views expressed in the literature are contradictory; we argue that ensemble behaviour can be sufficiently understood without the need to quantify ‘diversity’. However, in non-stationary learning scenarios, where we must process data that is not independent and identically distributed, the model must not only generalise well, but also adapt to changes in the distribution. Building on the work of Minku, we hypothesise that high diversity might be of special significance in such problems in determining the rate at which the model can adapt. We use the correspondence between diversity and margins to formulate the reasoning behind this intuition formally, and then derive an algorithm that explicitly manages diversity in order to test this hypothesis. An empirical investigation shows that managing diversity can, under certain conditions, improve the ability of an ensemble to adapt to a new concept; however, it typically seems that other aspects of the learning algorithm, especially concept change detection, have a substantially larger impact on performance than diversity does.
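The inverse relationship between diversity and absolute margins in 2-class problems can be illustrated with a small sketch. It assumes an unweighted majority-vote ensemble with votes in {+1, -1} and uses pairwise disagreement as the diversity measure; both choices, and the function names `abs_margin`, `pairwise_disagreement` and `profile`, are assumptions made for illustration and are not necessarily the definitions used in the thesis.

```python
from itertools import combinations


def abs_margin(votes):
    """Absolute voting margin on one example: |sum of +1/-1 votes| / ensemble size."""
    return abs(sum(votes)) / len(votes)


def pairwise_disagreement(votes):
    """Fraction of member pairs that vote differently on this example."""
    pairs = list(combinations(votes, 2))
    return sum(a != b for a, b in pairs) / len(pairs)


def profile(size):
    """(absolute margin, disagreement) for every possible vote split of `size` members."""
    rows = []
    for positives in range(size + 1):
        votes = [1] * positives + [-1] * (size - positives)
        rows.append((abs_margin(votes), pairwise_disagreement(votes)))
    return sorted(rows)


if __name__ == "__main__":
    # As the absolute margin grows, disagreement can only shrink.
    for margin, disagreement in profile(11):
        print(f"|margin| = {margin:.3f}  disagreement = {disagreement:.3f}")
```

Under these assumptions, an ensemble of T members with absolute margin m on an example has pairwise disagreement T(1 - m^2) / (2(T - 1)), so disagreement is highest when the vote is nearly tied and falls to zero when the vote is unanimous.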

Bibliographic metadata

Degree type:
Doctor of Philosophy
Degree programme:
PhD Computer Science
Location:
Manchester, UK
Total pages:
188
Language:
en

Record metadata

Manchester eScholar ID:
uk-ac-man-scw:183554
Created by:
Stapenhurst, Richard
Created:
16th December, 2012, 16:54:43
Last modified by:
Stapenhurst, Richard
Last modified:
16th May, 2013, 18:46:03
