Informatics tools for the analysis and assignment of phosphorylation status in proteomics
[Thesis]. Manchester, UK: The University of Manchester; 2015.
Abstract
Progress in the field of phosphoproteomics has been accelerated by mass spectrometry (MS). This is no surprise, owing not only to the accuracy, precision and high-throughput capabilities of MS but also to the support it receives from informaticians, who enable automated analysis and so make it relatively easy to go from a complex sample to a statistically satisfactory set of phosphopeptides and corresponding site positions. However, the process of identifying and subsequently pinpointing the phosphorylation moiety is not straightforward and remains a challenging task. Furthermore, it has been suggested that not all phosphorylation sites are of equal functional importance, to the extent that some may even lack function altogether. Clearly, such sites will confound efforts towards functional characterisation. The work in this thesis is aimed at these two issues: accurate site localisation and functional annotation. To address the first issue, I adopt a multi-tool approach for identification and site localisation, utilising the different underlying algorithms of each tool and thereby allowing an orthogonal perspective on the same tandem mass spectra. Doing so enhanced accuracy over any single tool by itself. The power of this multi-tool approach stemmed not from predicting more true positives but from removing false positives. For the second issue, I first investigated the hypothesis that sites of functional consequence exhibit stronger phosphorylation-characteristic features, such as the degree of conservation and disorder. Indeed, it was found that some features were enriched for the functional group. More surprisingly, there were also some that were enriched for the less-functional group, suggesting that their incorporation into a prediction algorithm would hinder functional prediction.
With this in mind, I train and optimise several machine-learning algorithms, using different combinations of features in an attempt to (separately) improve general phosphorylation and functional prediction.
Layman's Abstract
Phosphorylation is a key post-translational modification that is deeply embedded within the biological system. Its role, either direct or indirect, is regulatory, and it is responsible for a vast number of biological processes. As such, the task of pinpointing the precise locations of phosphorylation sites (phosphosites) and subsequently attempting to characterise their functional role is an active area of research. Presently, mass spectrometry-based strategies are the major players in the field of phosphoproteomics because they can acquire a global snapshot of the phosphoproteome with relative ease and are supported by an arsenal of software tools to process the data. This latter stage of identifying phosphopeptides and then pinpointing the precise phosphosite is a major challenge in the field. One reason in particular is that different software tools do not always agree with each other, causing confusion regarding the true identity of the site in question. However, even if one is able to pinpoint the phosphosite perfectly, another problem remains: is the phosphosite functionally important? This question has only recently been considered, as some sites may have little or even no function. The work in this thesis was focused on addressing these two aspects of phosphorylation annotation. For the issue of correctly pinpointing the phosphosite, I adopt a multi-tool approach with which I demonstrate improvements in both identification and localisation of phosphorylated species. With regard to functional prediction, I search for properties that can help to discriminate between functional and non-functional phosphosites and create a predictor that is capable of largely avoiding the latter.