Identifier 000361998
Title On the inverse filtering of speech
Alternative Title Inverse filtering of speech signals (Αντίστροφο φιλτράρισμα σημάτων φωνής)
Author Καφεντζής, Γεώργιος Παναγιώτη
Thesis advisor Στυλιανού, Ιωάννης
Abstract In source-filter models of speech production, Inverse Filtering (IF) is a well-known technique for obtaining the glottal flow waveform, which acts as the source of the vocal tract system. Estimating the glottal flow is of high interest in a variety of speech areas, such as voice quality assessment, speech coding and synthesis, and speech modification. A major obstacle to comparing or improving current state-of-the-art approaches is the lack of real data on the glottal flow. In other words, the results of the various inverse filtering algorithms cannot be evaluated directly because the actual glottal flow waveform is unknown. To this end, synthetic speech created from an artificial glottal waveform is widely used in the literature. This kind of evaluation, however, is not truly objective, because speech synthesis and IF are typically based on similar models of the human voice production apparatus, in our case the traditional source-filter model. This thesis presents three well-known IF methods based on Linear Prediction Analysis (LPA), together with a newly suggested method whose performance is compared to the others. The first is based on conventional autocorrelation LPA, and the second on conventional closed phase covariance LPA. The closed phase is identified using the method suggested by Plumpe and Quatieri, which relies on statistics of the first formant frequencies during a pitch period. The third is based on the work of Alku et al., who proposed an IF method based on a Mathematically Constrained Closed Phase Covariance LPA, in which mathematical constraints are imposed on the conventional covariance analysis. This results in more realistic root locations of the model on the z-plane.
Finally, Magi et al. suggested a new method for extracting the vocal tract filter, called Stabilized Weighted LP Analysis (SWLP), in which a short-time energy window controls the performance of the LP model. This method is suggested here for IF because it emphasizes speech samples that typically occur during the closed phase region of the speech signal. This is expected to yield a more robust, in the acoustic sense, vocal tract filter estimate than conventional autocorrelation LP. The three IF approaches, along with the newly suggested one, are applied to a database of physically modeled speech signals. In this case, both the glottal flow and the speech signal are available, so the IF methods can be evaluated directly. Robust time and frequency parametrization measures are applied to both the actual glottal flow and the estimated ones in order to evaluate the performance of the methods. These measures include the Normalized Amplitude Quotient (NAQ), the difference between the first two harmonics (H1-H2) of the glottal spectrum, and the Harmonic Richness Factor (HRF), along with the Signal to Reconstruction Error ratio (SRER). Experiments were conducted on physically modeled sustained vowels (/aa/, /ae/, /eh/, /ih/) over a wide range of fundamental frequencies (105 to 255 Hz) for both male and female speech. Glottal flow estimates were produced using short-time pitch-synchronous analysis and synthesis for the covariance-based methods, whereas for the autocorrelation methods a long analysis window and a short synthesis window were used. The results and measures are compared and discussed, showing that the covariance methods prevail, although the suggested method typically produces better results than conventional autocorrelation LP according to our metrics.
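As a rough illustration of the simplest of the methods named in the abstract, the following sketch estimates an all-pole vocal tract model with conventional autocorrelation LP and applies the corresponding inverse filter. It is not taken from the thesis: the function names, the Hamming window, and the model order are illustrative assumptions, and the SRER formula (ratio of standard deviations in dB) is a commonly used definition that may differ from the one the thesis adopts. A practical glottal IF system would typically also compensate for lip radiation, which is omitted here, and the closed phase covariance and SWLP variants are not shown.

```python
import numpy as np

def lp_autocorrelation(frame, order):
    """Return LP coefficients [1, a1, ..., a_order] (autocorrelation method)."""
    # Hamming window is an assumption; the thesis may use a different window.
    w = frame * np.hamming(len(frame))
    n = len(w)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(w[: n - k], w[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))

def inverse_filter(signal, a):
    """Apply the FIR inverse filter A(z) = 1 + a1 z^-1 + ... to the signal."""
    return np.convolve(signal, a)[: len(signal)]

def srer_db(reference, estimate):
    """Signal to Reconstruction Error ratio in dB (assumed definition)."""
    return 20.0 * np.log10(np.std(reference) / np.std(reference - estimate))
```

For a speech frame `x`, a call such as `inverse_filter(x, lp_autocorrelation(x, order))` yields the LP residual, and `srer_db` can then compare an estimated waveform against a known reference, which is possible only when the reference is available, as with the physically modeled signals described above.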
Language English
Subject Inverse filtering
Linear prediction
Speech analysis
Speech processing
Ανάλυση φωνής
Αντίστροφο φιλτράρισμα
Γραμμική πρόβλεψη
Επεξεργασία φωνής
Issue date 2010-11-19
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses
