LINNAEUS: A species name identification system for biomedical literature.

Gerner, Martin; Nenadic, Goran; Bergman, Casey M

BMC Bioinformatics. 2010;11:85.

Access to files

Full-text and supplementary files are not available from Manchester eScholar. Full-text is available externally using the following links:

Full-text held externally

Abstract

BACKGROUND: The task of recognizing and identifying species names in biomedical literature has recently been regarded as critical for a number of applications in text and data mining, including gene name recognition, species-specific document retrieval, and semantic enrichment of biomedical articles.

RESULTS: In this paper we describe an open-source species name recognition and normalization software system, LINNAEUS, and evaluate its performance relative to several automatically generated biomedical corpora, as well as a novel corpus of full-text documents manually annotated for species mentions. LINNAEUS uses a dictionary-based approach (implemented as an efficient deterministic finite-state automaton) to identify species names and a set of heuristics to resolve ambiguous mentions. When compared against our manually annotated corpus, LINNAEUS performs with 94% recall and 97% precision at the mention level, and 98% recall and 90% precision at the document level. Our system successfully solves the problem of disambiguating uncertain species mentions, with 97% of all mentions in PubMed Central full-text documents resolved to unambiguous NCBI taxonomy identifiers.

CONCLUSIONS: LINNAEUS is an open source, stand-alone software system capable of recognizing and normalizing species name mentions with speed and accuracy, and can therefore be integrated into a range of bioinformatics and text-mining applications. The software and manually annotated corpus can be downloaded freely at http://linnaeus.sourceforge.net/.
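
Illustrative sketch (not part of the original record, and not the LINNAEUS code): the dictionary-based matching described in the abstract can be approximated with a character trie, which behaves like a small deterministic automaton over the dictionary terms. The function names `build_trie` and `find_mentions` and the three-species dictionary are assumptions made for this example; the heuristics LINNAEUS uses to resolve ambiguous mentions are not modelled, and ASCII-style lowercasing is assumed.

```python
def build_trie(dictionary):
    """Build a character trie from {name: taxonomy_id}.

    A terminal node stores its taxonomy ID under the key None.
    """
    root = {}
    for name, tax_id in dictionary.items():
        node = root
        for ch in name.lower():
            node = node.setdefault(ch, {})
        node[None] = tax_id
    return root


def find_mentions(text, trie):
    """Return (start, end, matched_text, tax_id) tuples.

    Matching is case-insensitive, takes the longest dictionary term at
    each position, and only accepts matches on word boundaries.
    """
    lowered = text.lower()
    mentions = []
    i = 0
    while i < len(lowered):
        # Only start a match at the beginning of a word.
        if i > 0 and lowered[i - 1].isalnum():
            i += 1
            continue
        node, best, j = trie, None, i
        while j < len(lowered) and lowered[j] in node:
            node = node[lowered[j]]
            j += 1
            at_boundary = j == len(lowered) or not lowered[j].isalnum()
            if None in node and at_boundary:
                best = (j, node[None])  # remember the longest match so far
        if best:
            end, tax_id = best
            mentions.append((i, end, text[i:end], tax_id))
            i = end
        else:
            i += 1
    return mentions


# Hypothetical toy dictionary mapping names to NCBI taxonomy identifiers.
SPECIES = {
    "homo sapiens": 9606,
    "human": 9606,
    "mus musculus": 10090,
    "mouse": 10090,
    "escherichia coli": 562,
    "e. coli": 562,
}

trie = build_trie(SPECIES)
sentence = "Human and Mus musculus genes were compared with E. coli."
for start, end, surface, tax_id in find_mentions(sentence, trie):
    print(start, end, surface, tax_id)
```

Because the trie is traversed one character at a time with no backtracking inside a match, lookup cost depends on the text length rather than the dictionary size, which is the property that makes automaton-based dictionary matching attractive for large species lists.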

Bibliographic metadata

Language: eng
Volume: 11
Pagination: 85
Digital Object Identifier: 10.1186/1471-2105-11-85
Pubmed Identifier: 20149233
Pii Identifier: 1471-2105-11-85
Access state: Active

Record metadata

Manchester eScholar ID: uk-ac-man-scw:76745
Created by: Bergman, Casey
Created: 14th February, 2010, 21:00:45
Last modified by: Bergman, Casey
Last modified: 6th March, 2016, 19:32:41
