Doctoral theses

Identifier 000402821
Title Intelligibility enhancement of casual speech based on clear speech properties
Alternative Title Αύξηση της καταληπτότητας της ομιλίας χρησιμοποιώντας ιδιότητες καταληπτής ομιλίας (Increasing speech intelligibility using properties of intelligible speech)
Author Κουτσογιαννάκη, Μαρία Χαραλάμπους
Thesis advisor Στυλιανού, Γιάννης
Reviewer Μουχτάρης, Αθανάσιος
Hazan, Vazan
Abstract In adverse listening conditions (e.g., background noise or a hearing-impaired listener), people adjust their speech to overcome the communication difficulty and deliver their message successfully. This adjustment produces speaking styles that differ from unobstructed (casual) speech and vary among speakers and conditions, but share a common characteristic: high intelligibility. Algorithms that exploit the acoustic features of intelligible human speech could benefit speech technology applications that seek to enhance the intelligibility of “speaking devices”. Beyond their commercial uses (e.g., mobile telephony, GPS, customer service systems), these applications matter most in a medical context, where they provide assistive communication to people with speech or hearing deficits. However, current speech technology is “deaf”: unlike humans, it cannot adjust to dynamically changing real environments or to the specific needs of the listener. This work proposes signal modifications based on the acoustic properties of a highly intelligible human speaking style, clear speech, to assist in the development of smart speech technology systems that “mimic” the way people produce intelligible speech. Unlike other speaking styles, clear speech improves intelligibility for various listening populations (native and non-native listeners, hearing-impaired listeners, cochlear implant users, elderly people, people with learning disabilities, etc.) in many listening conditions (quiet, noise, reverberation). A significant part of this work is devoted to a comparative analysis of casual and clear speech, which reveals differences in prosody, vowel spaces, spectral energy, and the modulation depth of the temporal envelopes. Based on these observed and measured differences between the two speaking styles, we propose modifications for enhancing the intelligibility of casual speech.
Compared to other state-of-the-art modification systems, our modification techniques (1) do not require excessive computation, (2) are speaker- and speech-independent, (3) maintain speech quality, and (4) are explicit, since they require neither statistical training nor preexisting clear speech recordings. Intelligibility and quality are evaluated objectively, using recently proposed objective intelligibility scores, and subjectively, with listening tests conducted with native and non-native listeners in noise (speech-shaped noise, SSN), in reverberation, and in quiet. Results show that our modifications enhance speech intelligibility in SSN and reverberation for native and non-native listeners. Specifically, the proposed spectral modification technique, Mix-filtering, increases the intelligibility of speech in noise and reverberation while maintaining the quality of the original signal, unlike other intelligibility boosters. Moreover, a modulation depth enhancement technique called DMod increases speech intelligibility by more than 30% in SSN. The DMod algorithm is inspired both by clear speech properties and by the nonlinear phenomena that take place in the basilar membrane. DMod not only enhances speech intelligibility but also introduces a novel method for manipulating the modulation spectrum of the signal. The results of this study indicate a connection between the modulations of the temporal envelopes and speech perception, specifically the processes that take place on the basilar membrane of the human ear, and pave the way for analyzing and comprehending speech in terms of modulations.
Language English
Subject Casual
Lombard
Modulations
Noise
Διαμορφώσεις (Modulations)
Θόρυβος (Noise)
Καταληπτότητα (Intelligibility)
Ομιλία (Speech)
Issue date 2016-04-05
Collection   Faculty/Department--Faculty of Sciences and Engineering--Department of Computer Science--Doctoral theses
  Type of Work--Doctoral theses
Permanent Link https://elocus.lib.uoc.gr//dlib/4/4/3/metadata-dlib-1473927185-141698-3407.tkl
Views 220
