Annotating genes and genomes with DNA sequences extracted from biomedical articles

Haeussler, Maximilian; Gerner, Martin; Bergman, Casey M

Bioinformatics. 2011;27(7):980.

Access to files

Full-text and supplementary files are not available from Manchester eScholar. Full-text is available externally using the following links:

Full-text held externally

Abstract

MOTIVATION: Increasing rates of publication and DNA sequencing make the problem of finding relevant articles for a particular gene or genomic region more challenging than ever. Existing text-mining approaches focus on finding gene names or identifiers in English text. These are often not unique and do not identify the exact genomic location of a study. RESULTS: Here, we report the results of a novel text-mining approach that extracts DNA sequences from biomedical articles and automatically maps them to genomic databases. We find that ∼20% of open access articles in PubMed Central (PMC) have extractable DNA sequences that can be accurately mapped to the correct gene (91%) and genome (96%). We illustrate the utility of data extracted by text2genome from more than 150 000 PMC articles for the interpretation of ChIP-seq data and the design of quantitative reverse transcriptase (RT)-PCR experiments. CONCLUSION: Our approach links articles to genes and organisms without relying on gene names or identifiers. It also produces genome annotation tracks of the biomedical literature, thereby allowing researchers to use the power of modern genome browsers to access and analyze publications in the context of genomic data. AVAILABILITY AND IMPLEMENTATION: Source code is available under a BSD license from http://sourceforge.net/projects/text2genome/ and results can be browsed and downloaded at http://text2genome.org. CONTACT: maximilianh@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
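
The extraction step described in the abstract can be illustrated with a minimal sketch. This is not the text2genome implementation: the function name `extract_dna_candidates`, the regular expression, and the 20-letter minimum length are all assumptions made for illustration. The sketch only finds sequence-like strings in text; mapping them to a gene or genome would need a separate alignment step that is not shown here.

```python
import re

# Hypothetical threshold: shorter runs are too likely to be abbreviations
# or ordinary words rather than real sequences.
MIN_LENGTH = 20

# A candidate is one or more blocks of uppercase nucleotide letters,
# optionally separated by whitespace or digits (as in numbered listings).
CANDIDATE = re.compile(r"\b[ACGTUN]+(?:[\s\d]+[ACGTUN]+)*\b")


def extract_dna_candidates(text, min_length=MIN_LENGTH):
    """Return nucleotide strings of at least min_length letters found in text."""
    found = []
    for match in CANDIDATE.finditer(text):
        # Drop the spaces and digits that separated the blocks.
        letters = re.sub(r"[^A-Z]", "", match.group())
        if len(letters) >= min_length:
            found.append(letters)
    return found


if __name__ == "__main__":
    sample = (
        "The forward primer 5'-ATGGCCAAGTTGACCAGTGC-3' was used "
        "with GAPDH as control."
    )
    print(extract_dna_candidates(sample))
```

In this sketch, gene symbols such as GAPDH are skipped because they contain letters outside the nucleotide alphabet, and short all-nucleotide tokens are filtered out by the length threshold.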

Bibliographic metadata

Published date:
2011
Journal title:
Bioinformatics
Place of publication:
England
Volume:
27
Issue:
7
Start page:
980
Total:
1
Pagination:
980
Digital Object Identifier:
10.1093/bioinformatics/btr043
Pubmed Identifier:
21325301
Pii Identifier:
btr043
Access state:
Active

Record metadata

Manchester eScholar ID:
uk-ac-man-scw:121672
Created by:
Bergman, Casey
Created:
8th April, 2011, 09:47:56
Last modified by:
Bergman, Casey
Last modified:
6th March, 2016, 19:34:19
