Informatics tools for the analysis and assignment of phosphorylation status in proteomics

Lee, Dave

[Thesis]. Manchester, UK: The University of Manchester; 2015.

Abstract

Progress in the field of phosphoproteomics has been accelerated by mass spectrometry (MS). This is unsurprising, owing not only to the accuracy, precision and high-throughput capabilities of MS but also to the support it receives from informaticians, whose tools automate the analysis so that one can go from a complex sample to a statistically satisfactory set of phosphopeptides and corresponding site positions with relative ease. However, the process of identifying a phosphopeptide and subsequently pinpointing the phosphorylation moiety is not straightforward and remains a challenging task. Furthermore, it has been suggested that not all phosphorylation sites are of equal functional importance, to the extent that some may even lack function altogether. Clearly, such sites will confound efforts towards functional characterisation. The work in this thesis is aimed at these two issues: accurate site localisation and functional annotation. To address the first issue, I adopt a multi-tool approach for identification and site localisation, utilising the different underlying algorithms of each tool and thereby gaining an orthogonal perspective on the same tandem mass spectra. Doing so enhanced accuracy over any single tool by itself. The power of this multi-tool approach stemmed not from predicting more true positives but from removing false positives. For the second issue, I first investigated the hypothesis that sites of functional consequence exhibit stronger phosphorylation-characteristic features, such as the degree of conservation and disorder. Indeed, some features were found to be enriched in the functional group. More surprisingly, some were enriched in the less-functional group, suggesting that their incorporation into a prediction algorithm would hinder functional prediction.
With this in mind, I train and optimise several machine-learning algorithms, using different combinations of features in an attempt to (separately) improve general phosphorylation and functional prediction.
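
As a rough illustration of the multi-tool idea described in the abstract above, the sketch below keeps only those site assignments that at least two tools agree on for the same spectrum. The abstract does not state how the thesis actually combines tool outputs, so the voting rule, the function name `consensus_sites`, the `min_agree` threshold, and the tool names and data are all hypothetical.

```python
from collections import Counter


def consensus_sites(tool_calls, min_agree=2):
    """Return site assignments reported by at least `min_agree` tools.

    tool_calls maps a (hypothetical) tool name to a set of
    (spectrum_id, peptide, site_position) tuples reported by that tool.
    Simple agreement voting is an assumption made for illustration only.
    """
    votes = Counter()
    for calls in tool_calls.values():
        # set() guards against a tool reporting the same call twice
        votes.update(set(calls))
    return {call for call, n in votes.items() if n >= min_agree}


# Made-up example: three tools agree on scan1 but disagree on scan2.
calls = {
    "tool_a": {("scan1", "AASPEK", 3), ("scan2", "GTYLK", 2)},
    "tool_b": {("scan1", "AASPEK", 3), ("scan2", "GTYLK", 3)},
    "tool_c": {("scan1", "AASPEK", 3)},
}

agreed = consensus_sites(calls, min_agree=2)
print(sorted(agreed))
```

In this toy case the conflicting assignments for `scan2` are discarded, which mirrors the abstract's point that the benefit comes from removing false positives rather than from finding additional true positives.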

Layman's Abstract

Phosphorylation is a key post-translational modification that is deeply embedded within the biological system. Its role, whether direct or indirect, is regulatory, and it is responsible for a vast number of biological processes. As such, pinpointing the precise locations of phosphorylation sites and characterising their functional roles is an active area of research. Presently, mass spectrometry-based strategies are the major players in the field of phosphoproteomics, owing to their ability to acquire a global snapshot of the phosphoproteome with relative ease and to the arsenal of software tools available to process the data. This latter stage, identifying phosphopeptides and then pinpointing the precise phosphosite, is a major challenge in the field. One reason in particular is that different software tools do not always agree with each other, causing confusion regarding the true identity of the site in question. However, even if one is able to pinpoint the phosphosite perfectly, another problem arises: is the phosphosite functionally important? This question has only recently been considered, since some sites may have little or even no function. The work in this thesis focused on addressing these two aspects of phosphorylation annotation. For the issue of correctly pinpointing the phosphosite, I adopt a multi-tool approach and demonstrate improvements in both identification and localisation of phosphorylated species. With regard to functional prediction, I search for properties that can help to discriminate between functional and non-functional phosphosites and create a predictor that is capable of largely avoiding the latter.

Bibliographic metadata

Degree programme:
PhD DTC Systems Biology (FLS)
Location:
Manchester, UK
Total pages:
177
Language:
en

Record metadata

Manchester eScholar ID:
uk-ac-man-scw:259036
Created by:
Lee, Dave
Created:
9th February, 2015, 20:42:16
Last modified by:
Lee, Dave
Last modified:
16th November, 2017, 14:24:41
