pubmed2ensembl: A Resource for Mining the Biological Literature on Genes

Baran, Joachim; Gerner, Martin; Haeussler, Maximilian; Nenadic, Goran; Bergman, Casey M

PLoS ONE. 2011;6(9).

Access to files

Full-text and supplementary files are not available from Manchester eScholar. Full-text is held externally (DOI: 10.1371/journal.pone.0024716).

Abstract

BACKGROUND: The last two decades have witnessed a dramatic acceleration in the production of genomic sequence information and publication of biomedical articles. Despite the fact that genome sequence data and publications are two of the most heavily relied-upon sources of information for many biologists, very little effort has been made to systematically integrate data from genomic sequences directly with the biological literature. For a limited number of model organisms, dedicated teams manually curate publications about genes; however, for species with no such dedicated staff, many thousands of articles are never mapped to genes or genomic regions.
METHODOLOGY/PRINCIPAL FINDINGS: To overcome the lack of integration between genomic data and biological literature, we have developed pubmed2ensembl (http://www.pubmed2ensembl.org), an extension to the BioMart system that links over 2,000,000 articles in PubMed to nearly 150,000 genes in Ensembl from 50 species. We use several curated (e.g., Entrez Gene) and automatically generated (e.g., gene names extracted through text-mining on MEDLINE records) sources of gene-publication links, allowing users to filter and combine different data sources to suit their individual needs for information extraction and biological discovery. In addition to extending the Ensembl BioMart database to include published information on genes, we also implemented a scripting language for automated BioMart construction and a novel BioMart interface that allows text-based queries to be performed against PubMed and PubMed Central documents in conjunction with constraints on genomic features. Finally, we illustrate the potential of pubmed2ensembl through typical use cases that involve integrated queries across the biomedical literature and genomic data.
CONCLUSION/SIGNIFICANCE: By allowing biologists to find the relevant literature on specific genomic regions or sets of functionally related genes more easily, pubmed2ensembl offers a much-needed genome-informatics-inspired solution to accessing the ever-increasing biomedical literature.
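
The abstract describes filtering and combining curated and automatically generated sources of gene-publication links. The following is a minimal Python sketch of that idea only; it is not the pubmed2ensembl implementation and does not use BioMart. The function name `articles_by_gene`, the source labels, and all gene and article identifiers are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Toy (gene, article, source) links. Every identifier here is made up
# for illustration and does not refer to a real gene or article.
LINKS = [
    ("GENE_A", 1001, "curated"),
    ("GENE_A", 1002, "text_mined"),
    ("GENE_A", 1001, "text_mined"),
    ("GENE_B", 1003, "text_mined"),
]


def articles_by_gene(links, sources, mode="union"):
    """Map each gene to the article IDs supported by the chosen sources.

    mode="union": keep an article if any chosen source links it to the gene.
    mode="intersection": keep it only if every chosen source does.
    """
    if mode not in ("union", "intersection"):
        raise ValueError(f"unknown mode: {mode!r}")

    sources = set(sources)
    support = defaultdict(set)  # (gene, article) -> sources asserting the link
    for gene, article, source in links:
        if source in sources:
            support[(gene, article)].add(source)

    result = defaultdict(set)
    for (gene, article), seen in support.items():
        if mode == "union" or seen == sources:
            result[gene].add(article)
    return dict(result)
```

Under these assumptions, calling `articles_by_gene(LINKS, ["curated", "text_mined"], mode="intersection")` keeps only links asserted by both sources, trading recall for precision, while `mode="union"` does the opposite.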

Bibliographic metadata

Published date: 2011
Journal title: PLoS ONE
Place of publication: United States
Volume: 6
Issue: 9
Digital Object Identifier: 10.1371/journal.pone.0024716
PubMed Identifier: 21980353
PII Identifier: PONE-D-11-09930
Access state: Active

Record metadata

Manchester eScholar ID: uk-ac-man-scw:132782
Created by: Bergman, Casey
Created: 10th October, 2011, 14:41:43
Last modified by: Bergman, Casey
Last modified: 6th March, 2016, 19:33:55
