Predictive Variable Selection for Subgroup Identification

Turner, Emily

[Thesis]. Manchester, UK: The University of Manchester; 2017.

Abstract

The problem of exploratory subgroup identification can be broken down into three steps. The first step is to identify predictive features, the second is to identify the interesting regions on those features, and the third is to estimate the properties of the subgroup region, such as subgroup size and the predicted recovery outcome for individuals belonging to this subgroup. While most work in this field analyses the full subgroup identification procedure, we provide an in-depth examination of the first step, predictive feature identification. A feature is defined as predictive if it interacts with a treatment to affect the recovery outcome. We compare three prominent methods for exploratory subgroup identification: Virtual Twins (Foster et al. 2011), SIDES (Subgroup Identification based on Differential Effect Search, Lipkovich et al. 2011) and GUIDE (Generalised, Unbiased Interaction Detection and Estimation, Loh et al. 2015). First, we provide a theoretical interpretation of the problem of predictive variable selection and connect it with the three methods. We believe that bringing different approaches under a common analytical framework facilitates a clearer comparison between them. We show that Virtual Twins and SIDES select interesting features in a theoretically similar way, so that the essential difference between the two lies in how this selection mechanism is implemented in their respective subgroup identification procedures. Second, we undertake an experimental analysis of the three methods. To do this, we apply each method to return a predictive variable importance measure (PVIM), which we use to rank features in order of their predictiveness. We then evaluate and compare how well each method performs at this task. Although each of Virtual Twins, SIDES and GUIDE either outputs a PVIM or requires minor adaptation to do so, their strengths and weaknesses as PVIMs had not been explored prior to this work.
We argue that a variable ranking approach is a particularly good solution to the problem of subgroup identification. Because clinical trials often lack the power to identify predictive features with statistical significance, predictive variable scoring and ranking may be more appropriate than a full subgroup identification procedure. PVIMs enable a clinician to visualise the relative importance of each feature in a straightforward manner and to use clinical expertise to scrutinise the findings of the algorithm. We conclude that Virtual Twins performs best in terms of predictive feature selection, outperforming SIDES and GUIDE on every type of data set tested. However, it appears to have weaknesses in distinguishing between predictive and prognostic biomarkers. Finally, we note that there is a need for common data sets on which new methods can be evaluated. We show that there is a tendency to test new subgroup identification methods on data sets that demonstrate the strengths of the algorithm and hide its weaknesses.
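As an illustration of the abstract's definition of a predictive feature (one that interacts with the treatment to affect the outcome), the following minimal sketch scores each feature by the strength of its treatment interaction in a per-feature linear model and ranks features by that score. It is not Virtual Twins, SIDES or GUIDE, and it is not taken from the thesis. The function names, the simulated trial and all parameter values are assumptions chosen only to show what a predictive variable importance ranking might look like.

```python
import numpy as np


def interaction_scores(X, t, y):
    """Score each feature by the absolute t-statistic of its treatment
    interaction in the model y ~ 1 + t + x_j + t * x_j (ordinary least squares).

    Hypothetical helper for illustration; not a method from the thesis.
    """
    n, p = X.shape
    scores = np.empty(p)
    for j in range(p):
        xj = X[:, j]
        # Columns: intercept, treatment, feature, treatment-by-feature interaction.
        design = np.column_stack([np.ones(n), t, xj, t * xj])
        beta, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        sigma2 = resid @ resid / (n - design.shape[1])
        cov = sigma2 * np.linalg.inv(design.T @ design)
        scores[j] = abs(beta[3]) / np.sqrt(cov[3, 3])
    return scores


def make_toy_trial(n=2000, p=5, seed=0):
    """Simulate a randomised trial (assumed setup): feature 0 is prognostic
    only, feature 1 is predictive, and the remaining features are noise."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    t = rng.integers(0, 2, size=n).astype(float)  # random treatment assignment
    y = 2.0 * X[:, 0] + 1.0 * t * X[:, 1] + rng.normal(size=n)
    return X, t, y


X, t, y = make_toy_trial()
scores = interaction_scores(X, t, y)
ranking = np.argsort(-scores)  # feature indices, most predictive first
```

In this toy setup the prognostic feature (index 0) has the largest main effect on the outcome, but the ranking is driven by the interaction term, so the predictive feature (index 1) should come first. A score of this kind only captures linear interactions. The methods compared in the thesis are tree-based and are not restricted in this way.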

Keyword(s)

Interaction detection; Recursive partitioning; Subgroup identification

Bibliographic metadata

Type of resource:

text

Content type:

Administered thesis

Form of thesis:

Traditional

Type of submission:

Doctoral level ETD - final

Thesis title:

Predictive Variable Selection for Subgroup Identification

Degree type:

Master of Philosophy

Degree programme:

MPhil Computer Science

Publication date:

2017-12-21T14:52:04

Institution:

The University of Manchester

Location:

Manchester, UK

Thesis main supervisor(s):

BROWN, GAVIN G

Thesis co-supervisor(s):

STEVENS, ROBERT RD

Degree grantor:

The University of Manchester

Institutional metadata

University researcher(s):

Turner, Emily

Record metadata

Manchester eScholar ID:

uk-ac-man-scw:312697

Created by:

Turner, Emily

Created:

21st December, 2017, 14:52:04

Last modified by:

Turner, Emily

Last modified:

3rd January, 2018, 13:42:04