Improved Load-balancing for a Chord-based Peer-to-Peer Storage System in a Cluster Environment

Chen, Fu

[Thesis]. Manchester, UK: The University of Manchester; 2015.

Abstract

The thesis investigates deployment of a Peer-to-Peer storage system in a cluster environment, in which machines have good, persistent network connections, in order to provide the functionality of a data centre. For various reasons, the implementation is based on the Peer-to-Peer system known as Chord. Chord naturally provides storage load-balancing, especially if its virtual node scheme is used, but this needs to be improved if Chord is used to implement a storage system. A novel, threshold-based storage load-balancing scheme is proposed. Each machine in the system contributes a fixed amount of disk storage space to the Peer-to-Peer storage system. The system commences operation in the normal Chord manner except that two distinct sets of tables are initialised, one to maintain the usual Chord Ring, and one to maintain proximity information about the machines in the system. As files are inserted, the collective storage space gradually fills up. When any machine reaches the threshold for usage of its contributed space, the system behaviour is modified. Attempts are made, repeatedly if necessary, to migrate virtual nodes from heavily loaded machines to less heavily loaded machines elsewhere in the system. The proximity information is used so as to minimise the costs of this migration. The nature of the proximity information is complex, and a Space-Filling Curve is utilised to reduce the complexity. For reasons of effectiveness, demonstrated by an evaluation against other kinds of Space-Filling Curve, the Hilbert curve is specifically chosen. The performance of the resulting implementation is evaluated in a practical experimental environment which consists of five teaching laboratories in the author’s school. Under the specific conditions of the experiments, the new system achieves significantly better distribution of storage utilisation across the participating machines and also defers the onset of unreliable behaviour in the system.
In one experiment, the proportion of the total available storage space actually utilised by the system increased from approximately 43% to approximately 62% using the proposed mechanism. The parameters used in the experiments were chosen somewhat arbitrarily, so even better results may be feasible.
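
The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea it describes: when a machine exceeds its usage threshold, a virtual node is migrated to a less loaded machine that is close in proximity terms, with a Hilbert curve reducing two-dimensional proximity to a single index. Everything specific here is an assumption for illustration, not the thesis's method: the `Machine` class, the 2-D grid coordinates, the "largest virtual node first" choice, and the function names `hilbert_index`, `pick_move` and `rebalance`. The Hilbert mapping itself is the standard iterative coordinate-to-index conversion.

```python
from dataclasses import dataclass, field


def hilbert_index(side: int, x: int, y: int) -> int:
    """Map cell (x, y) of a side x side grid (side a power of two)
    to its position along the Hilbert curve."""
    d = 0
    s = side // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = side - 1 - x, side - 1 - y
            x, y = y, x
        s //= 2
    return d


@dataclass
class Machine:
    name: str
    capacity: int  # contributed disk space, arbitrary units (assumed)
    x: int         # hypothetical proximity coordinates on a grid
    y: int
    vnodes: dict = field(default_factory=dict)  # vnode id -> stored size

    def used(self) -> int:
        return sum(self.vnodes.values())

    def utilisation(self) -> float:
        return self.used() / self.capacity


def pick_move(src, machines, threshold, key):
    """Choose (vnode, destination) for one migration, or None if none fits."""
    # Assumption: try the largest virtual nodes first to need fewer moves.
    for vnode, size in sorted(src.vnodes.items(), key=lambda kv: -kv[1]):
        candidates = [
            m for m in machines
            if m is not src and m.used() + size <= threshold * m.capacity
        ]
        if candidates:
            # Prefer the candidate closest to the source along the curve.
            dst = min(candidates, key=lambda m: abs(key[m.name] - key[src.name]))
            return vnode, dst
    return None


def rebalance(machines, threshold, side):
    """Repeatedly migrate virtual nodes off machines above the threshold.
    Returns the list of (vnode, source, destination) moves performed."""
    key = {m.name: hilbert_index(side, m.x, m.y) for m in machines}
    moves = []
    for src in machines:
        while src.utilisation() > threshold:
            move = pick_move(src, machines, threshold, key)
            if move is None:
                break  # nothing can be migrated without overloading a target
            vnode, dst = move
            dst.vnodes[vnode] = src.vnodes.pop(vnode)
            moves.append((vnode, src.name, dst.name))
    return moves
```

For example, with a threshold of 0.8 on a 4 x 4 grid, a machine at (0, 0) holding virtual nodes of size 50 and 40 out of a capacity of 100 would hand its size-50 virtual node to a lightly loaded neighbour at (1, 0) rather than to an equally suitable machine at (3, 3), because the neighbour's Hilbert index is closer.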

Bibliographic metadata

Type of resource:
Content type:
Form of thesis:
Type of submission:
Degree type:
Doctor of Philosophy
Degree programme:
PhD Computer Science
Publication date:
Location:
Manchester, UK
Total pages:
158
Thesis main supervisor(s):
Thesis co-supervisor(s):
Language:
en

Institutional metadata

University researcher(s):

Record metadata

Manchester eScholar ID:
uk-ac-man-scw:275817
Created by:
Chen, Fu
Created:
19th October, 2015, 08:00:40
Last modified by:
Chen, Fu
Last modified:
17th November, 2017, 08:44:32
