Challenges in identifying asthma subgroups using unsupervised statistical learning techniques.

Prosperi, Mattia C F; Sahiner, Umit M; Belgrave, Danielle; Sackesen, Cansin; Buchan, Iain E; Simpson, Angela; Yavuz, Tolga S; Kalayci, Omer; Custovic, Adnan

American journal of respiratory and critical care medicine. 2013;188(11):1303-12.

Abstract

RATIONALE: Unsupervised statistical learning techniques, such as exploratory factor analysis (EFA) and hierarchical clustering (HC), have been used to identify asthma phenotypes, with partly consistent results. Some of the inconsistency is caused by the variable selection and demographic and clinical differences among study populations.

OBJECTIVES: To investigate the effects of the choice of statistical method and different preparations of data on the clustering results; and to relate these to disease severity.

METHODS: Several variants of EFA and HC were applied and compared using various sets of variables and different encodings and transformations within a dataset of 383 children with asthma. Variables included lung function, inflammatory and allergy markers, family history, environmental exposures, and medications. Clusters and original variables were related to asthma severity (logistic regression and Bayesian network analysis).

MEASUREMENTS AND MAIN RESULTS: EFA identified five components (eigenvalues ≥ 1) explaining 35% of the overall variance. Variations of the HC (as linkage-distance functions) did not affect the cluster inference; however, using different variable encodings and transformations did. The derived clusters predicted asthma severity less than the original variables. Prognostic factors of severity were medication usage, current symptoms, lung function, paternal asthma, body mass index, and age of asthma onset. Bayesian networks indicated conditional dependence among variables.

CONCLUSIONS: The use of different unsupervised statistical learning methods and different variable sets and encodings can lead to multiple and inconsistent subgroupings of asthma, not necessarily correlated with severity. The search for asthma phenotypes needs more careful selection of markers, consistent across different study populations, and more cautious interpretation of results from unsupervised learning.
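
The sensitivity of hierarchical clustering to variable transformations can be illustrated with a small sketch. This is not the authors' code or data: the four points, the two variable names, and the helper functions `agglomerative` and `standardize` are all hypothetical, and the numbers were constructed only to show how a rescaling step may change the grouping while a change of linkage does not.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerative(points, k, linkage="complete"):
    """Naive agglomerative clustering down to k clusters.

    Returns a set of frozensets of point indices.
    """
    agg = {"single": min, "complete": max}[linkage]
    clusters = [frozenset([i]) for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(clusters, 2):
            d = agg(euclidean(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)

def standardize(points):
    """Z-score each column (population standard deviation)."""
    cols = list(zip(*points))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c))
           for c, m in zip(cols, means)]
    return [tuple((x - m) / s for x, m, s in zip(p, means, sds))
            for p in points]

# Made-up toy values: (wide-range marker, lung function in % predicted).
raw = [(100, 95), (300, 96), (180, 60), (400, 61)]
scaled = standardize(raw)

for name, data in (("raw", raw), ("standardized", scaled)):
    for linkage in ("single", "complete"):
        groups = sorted(sorted(c) for c in agglomerative(data, 2, linkage))
        print(name, linkage, groups)
```

In this toy case the raw values are dominated by the wide-range marker, whereas after standardization the second variable carries equal weight, so the two-cluster solution differs between encodings for both linkages. A real analysis would more likely use an established library implementation of hierarchical clustering than this naive loop.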

Bibliographic metadata

Place of publication: United States
Volume: 188
Issue: 11
Pagination: 1303-12
Digital Object Identifier: 10.1164/rccm.201304-0694OC
PubMed Identifier: 24180417
Access state: Active

Record metadata

Manchester eScholar ID: uk-ac-man-scw:242249
Created by: Heydon, Kirsty
Created: 5th December, 2014, 12:27:47
Last modified by: Heydon, Kirsty
Last modified: 5th December, 2014, 12:27:47
