Cost-efficient Resource Management for Scientific Workflows on the Cloud

Pietri, Ilia

[Thesis]. Manchester, UK: The University of Manchester; 2016.

Abstract

Scientific workflows are used in many scientific fields to abstract complex computations (tasks) and the data or flow dependencies between them. High performance computing (HPC) systems have been widely used for the execution of scientific workflows. Cloud computing has gained popularity by offering users on-demand provisioning of resources and the ability to choose from a wide range of possible configurations. To do so, resources are made available in the form of virtual machines (VMs), each described by a set of resource characteristics, e.g. amount of CPU and memory. The notion of VMs enables the use of different resource combinations, which facilitates the deployment of applications and the management of resources. A problem that arises is determining the configuration, such as the number and type of resources, that leads to efficient resource provisioning. For example, allocating a large amount of resources may reduce application execution time, but at the expense of increased cost. This thesis investigates the challenges that arise in resource provisioning and task scheduling for scientific workflows and explores ways to address them, developing approaches to improve energy efficiency for scientific workflows and meet the user’s objectives, e.g. makespan and monetary cost. The motivation stems from the wide range of options that make it possible to select cost-efficient configurations and improve resource utilisation. The contributions of this thesis are the following.
(i) A survey of the issues arising in resource management in cloud computing. The survey focuses on VM management, cost efficiency and the deployment of scientific workflows.
(ii) A performance model to estimate the workflow execution time for different numbers of resources based on the workflow structure. The model can be used to estimate the respective user and energy costs in order to determine configurations that lead to efficient resource provisioning and achieve a balance between various conflicting goals.
(iii) Two energy-aware scheduling algorithms that maximise the number of completed workflows from an ensemble under energy and budget or deadline constraints. The algorithms address the problem of energy-aware resource provisioning and scheduling for scientific workflow ensembles.
(iv) An energy-aware algorithm that selects the frequency to be used for each workflow task in order to achieve energy savings without exceeding the workflow deadline. The algorithm takes into account the different requirements and constraints that arise depending on the workflow and system characteristics.
(v) Two cost-based frequency selection algorithms that choose the CPU frequency for each provisioned resource in order to achieve cost-efficient resource configurations for the user and complete the workflow within the deadline. Decision making is based on both the workflow characteristics and the pricing model of the provider.

Bibliographic metadata

Degree type:
Doctor of Philosophy
Degree programme:
PhD Computer Science (CDT)
Publication date:
Location:
Manchester, UK
Total pages:
213
Language:
en

Record metadata

Manchester eScholar ID:
uk-ac-man-scw:295520
Created by:
Pietri, Ilia
Created:
20th January, 2016, 11:41:03
Last modified by:
Pietri, Ilia
Last modified:
1st December, 2017, 09:08:33
