Representation learning with a temporally coherent mixed-representation
[Thesis]. Manchester, UK: The University of Manchester; 2017.
Abstract
Guiding a representation towards capturing temporally coherent aspects present in video improves object identity encoding. Existing models apply temporal coherence uniformly over all features based on the assumption that optimal encoding of object identity only requires temporally stable components. We test the validity of this assumption by exploring the effects of applying a mixture of temporally coherent invariant features, alongside variable features, in a single 'mixed' representation. Applying temporal coherence to different proportions of the available features, we evaluate a range of models on a supervised object classification task. This series of experiments was tested on three video datasets, each with a different complexity of object shape and motion. We also investigated whether a mixed representation improves the capture of information components associated with object position, alongside object identity, in a single representation. Tests were initially applied using a single-layer autoencoder as a test bed, followed by subsequent tests investigating whether similar behaviour occurred in the more abstract features learned by a deep network. A representation applying temporal coherence in some fashion produced the best results in all tests, on both single-layered and deep networks. The majority of tests favoured a mixed representation, especially in cases where the quantity of labelled data available to the supervised task was plentiful. This work is the first time a mixed representation has been investigated, and demonstrates its use as a method for representation learning.
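As a rough illustration only (not code from the thesis), the core idea of applying temporal coherence to a proportion of the available features can be sketched as a slowness penalty restricted to the first fraction of a latent code, leaving the remaining features unconstrained. The function name `mixed_temporal_loss` and its parameters are hypothetical; in training this term would be added to an autoencoder's reconstruction loss.

```python
import numpy as np

def mixed_temporal_loss(z_t, z_t1, coherent_frac=0.5, lam=1.0):
    """Slowness penalty on a fraction of latent features.

    z_t, z_t1     : latent codes for consecutive video frames
    coherent_frac : fraction of features treated as temporally coherent
                    (the 'invariant' part of the mixed representation)
    lam           : weight of the penalty relative to the reconstruction loss
    """
    k = int(round(coherent_frac * z_t.shape[-1]))
    if k == 0:  # fully variable representation: no coherence penalty
        return 0.0
    # Penalise change between frames only on the first k features
    diff = z_t1[..., :k] - z_t[..., :k]
    return lam * float(np.mean(diff ** 2))
```

Setting `coherent_frac=1.0` recovers the uniform temporal-coherence scheme the thesis compares against; intermediate values give the mixed representations under test.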
Keyword(s)
Autoencoders; Computer vision; Neural networks; Representation learning; Temporal coherence; Unsupervised learning