
EXPLOITING PHONOLOGICAL CONSTRAINTS AND AUTOMATIC IDENTIFICATION OF SPEAKER CLASSES FOR ARABIC SPEECH RECOGNITION

Alsharhan, Eiman T a e a

[Thesis]. Manchester, UK: The University of Manchester; 2014.

Abstract

The aim of this thesis is to investigate a number of factors that could affect the performance of an Arabic automatic speech understanding (ASU) system. The work described in this thesis belongs to the speech recognition (ASR) phase, but the fact that it is part of an ASU project rather than a stand-alone piece of work on ASR influences the way in which it will be carried out. Our main concern in this work is to determine the best way to exploit the phonological properties of the Arabic language in order to improve the performance of the speech recogniser. One of the main challenges facing the processing of Arabic is the effect of the local context, which induces changes in the phonetic representation of a given text, thereby causing the recognition engine to misclassify it. The proposed solution is to develop a set of language-dependent grapheme-to-allophone rules that can predict such allophonic variations and eventually provide a phonetic transcription that is sensitive to the local context for the ASR system. The novel aspect of this method is that the pronunciation of each word is extracted directly from a context-sensitive phonetic transcription rather than a predefined dictionary that typically does not reflect the actual pronunciation of the word. Besides investigating the boundary effect on pronunciation, the research also seeks to address the problem of Arabic's complex morphology. Two solutions are proposed to tackle this problem, namely, using underspecified phonetic transcription to build the system, and using phonemes instead of words to build the Hidden Markov Models (HMMs). The research also seeks to investigate several technical settings that might have an effect on the system's performance. These include training on the sub-population to minimise the variation caused by training on the main undifferentiated population, as well as investigating the correlation between training size and performance of the ASR system.
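
As a rough illustration of what a context-sensitive pronunciation rule can look like, the sketch below applies one well-known Arabic process: the /l/ of the definite article assimilates to a following "sun" consonant. This is not the rule set developed in the thesis. The hyphenated romanisation, the `SUN_CONSONANTS` set, and the function name `apply_article_assimilation` are all assumptions made for this example.

```python
# Hypothetical sketch: one context-sensitive rule over an ad hoc romanisation.
# Nothing here is taken from the thesis; symbols and names are illustrative.

# "Sun" consonants in a simple romanisation (capital letters stand for the
# emphatic consonants). The article's /l/ assimilates to any of these.
SUN_CONSONANTS = {"t", "th", "d", "dh", "r", "z", "s", "sh",
                  "S", "D", "T", "Z", "l", "n"}


def apply_article_assimilation(word):
    """Return a pronunciation for a word written as 'al-<stem>'.

    If the stem begins with a sun consonant, the article's /l/ is replaced
    by a copy of that consonant; otherwise the word is returned unchanged.
    """
    prefix, hyphen, stem = word.partition("-")
    if not hyphen or prefix != "al":
        return word  # no explicit article, so the rule does not apply
    # Try two-letter symbols ("sh", "th", "dh") before single letters.
    for consonant in sorted(SUN_CONSONANTS, key=len, reverse=True):
        if stem.startswith(consonant):
            return "a" + consonant + "-" + stem
    return word


if __name__ == "__main__":
    for w in ["al-shams", "al-qamar", "al-nur"]:
        print(w, "->", apply_article_assimilation(w))
```

The output for a given written form depends on the sound that follows the article, which is the kind of local-context dependence a fixed pronunciation dictionary entry cannot capture on its own.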

Bibliographic metadata

Degree type:
Doctor of Philosophy
Degree programme:
PhD Computer Science
Location:
Manchester, UK
Total pages:
244
Additional digital content not deposited electronically:
none
Non-digital content not deposited electronically:
none
Language:
en

Record metadata

Manchester eScholar ID:
uk-ac-man-scw:220338
Created by:
Alsharhan, Eiman T a e a
Created:
27th February, 2014, 23:21:27
Last modified by:
Alsharhan, Eiman T a e a
Last modified:
1st December, 2017, 09:07:38
