A General Framework for Building Accurate and Understandable Genomic Models: A Study in Rice (Oryza Sativa)

Orhobor, Oghenejokpeme Israel

[Thesis]. Manchester, UK: The University of Manchester; 2019.

Abstract

Rapid advances in genotyping and sequencing technologies are driving the generation of vast amounts of genomic data. These advances present a unique opportunity to improve our understanding of the environmental and genetic mechanisms that give rise to phenotypes. These data are technically hard to analyse because there are many attributes (often on the order of a million) and vast quantities of relevant background knowledge. Genotype data are most commonly used in genomic models to identify genetic regions that control phenotypes and to predict the likelihood that members of a population will produce progeny with particular phenotypes. However, most of the data may be irrelevant for a given phenotype, leading to suboptimal, difficult-to-understand models. To meet this challenge, we propose a three-stage general framework that incorporates background knowledge into its model-building process by applying feature stability, inductive logic programming (ILP), and meta-learning. In the first stage, we identify associated markers using marker stability rather than traditional mixed models. In the second stage, we formalise the identified markers and additional background knowledge as predicates in first-order logic and, using an ILP engine, identify frequent patterns that correspond to genetic configurations associated with a trait. Finally, the frequent patterns identified in the second stage are used as additional data for phenotype prediction. Using a diverse rice (Oryza sativa) population, we demonstrate that this framework (1) significantly outperforms the state of the art in identifying associated genomic regions, (2) identifies relevant genetic configurations, and (3) improves overall phenotype prediction.

Bibliographic metadata

Type of resource: text
Content type: Administered thesis
Form of thesis: Traditional
Type of submission: Doctoral level ETD - final
Thesis title: A General Framework for Building Accurate and Understandable Genomic Models: A Study in Rice (Oryza Sativa)
Degree type: Doctor of Philosophy
Degree programme: PhD Computer Science (CDT)
Publication date: 2019-02-07T09:27:15
Institution: The University of Manchester
Location: Manchester, UK
Total pages: 161

Thesis main supervisor(s): KING, ROSS RD
Thesis co-supervisor(s): BROWN, GAVIN G
Degree grantor: The University of Manchester

Institutional metadata

University researcher(s): Orhobor, Oghenejokpeme

Record metadata

Manchester eScholar ID: uk-ac-man-scw:318296
Created by: Orhobor, Oghenejokpeme
Created: 7th February, 2019, 09:27:15
Last modified by: Orhobor, Oghenejokpeme
Last modified: 8th February, 2019, 13:28:13