      SCALABLE GAUSSIAN PROCESS METHODS FOR SINGLE-CELL DATA

      Ahmed, Sumon

      [Thesis]. Manchester, UK: The University of Manchester; 2020.

      Abstract

      The analysis of single-cell data creates the opportunity to examine the temporal dynamics of complex biological processes where the generation of time course experiments is challenging or technically impossible. One popular approach is to learn a lower-dimensional manifold or trajectory through the data that captures its major sources of variation. Gene expression patterns can then be aligned through different lineages in the trajectory as smooth functions of pseudotime, which promises to facilitate the identification of differentially expressed (DE) genes across trajectories. We briefly review some popular trajectory inference and downstream analysis methods along with their strengths and assumptions. We provide a brief overview of Gaussian process (GP) inference and describe how GPs can be used for dimensionality reduction and data association, which later facilitate probabilistic pseudotime estimation and downstream analysis for inferring DE genes and branching times. We present a scalable implementation of the Gaussian process latent variable model (GPLVM) and develop a pseudotime estimation method that scales to large-volume droplet-based single-cell datasets and can be extended to higher-dimensional latent spaces to capture other sources of variation such as branching dynamics. The model's efficacy is evaluated on a number of datasets from different organisms collected using different protocols. The model converges significantly faster than existing methods whilst achieving comparable estimation accuracy. We reimplement an existing downstream analysis method for identifying branching dynamics from bulk time series data and apply it to single-cell data after pseudotime inference, extending the model to handle count data. We also present the limitations of a recent approach to inference of branching dynamics in single-cell data and extend the model to mitigate them. Our downstream analysis models are shown to successfully identify branching locations for individual genes when applied to simulated data and to single-cell data from mouse haematopoietic stem cells (HSCs).

      Bibliographic metadata

      Degree type:
      Doctor of Philosophy
      Degree programme:
      PhD Bioinformatics 3yr (IIDS)
      Location:
      Manchester, UK
      Total pages:
      174
      Language:
      en

        Record metadata

        Manchester eScholar ID:
        uk-ac-man-scw:323186
        Created by:
        Ahmed, Sumon
        Created:
        11th January, 2020, 18:56:30
        Last modified by:
        Ahmed, Sumon
        Last modified:
        6th February, 2020, 10:32:01
