Identifier 000385776
Title Sparse and low-rank techniques for robust speaker recognition and missing-features reconstruction
Alternative Title (translated from Greek) Sparse and low-rank representation techniques for robust speaker recognition and reconstruction of missing features
Author Τζαγκαράκης, Χρήστος
Thesis advisor Μουχτάρης, Αθανάσιος
Reviewer Στυλιανού, Ιωάννης
Τσακαλίδης, Παναγιώτης
Abstract Speaker recognition is the process of automatically recognizing a speaker from specific features extracted from the speech signal. A broad range of applications relies on speaker recognition at its core, and environmental noise in the speech signal usually impedes correct decisions. A further factor that makes correct recognition difficult is the limited amount of available training and evaluation data. To overcome these limitations, this dissertation is divided into two main parts.
In the first part, the problem of speaker recognition is reduced to an equivalent classification problem. To this end, we develop classification techniques based on the framework of sparse representations and study their performance, focusing on speaker identification with highly limited amounts of training and evaluation data in environments with high levels of noise. The main assumption behind these techniques is that the features extracted from the speech signal to be identified can be expressed as a sparse linear combination of the columns of an overcomplete matrix, often referred to in the literature as a “dictionary”. The optimally estimated sparse weights of the linear combination, the so-called sparse codes, are obtained as the solution of an optimization problem and are then used for the final identification of the speaker based on a minimum reconstruction error criterion. Extending this classification method, we study the efficiency of a method for discriminative dictionary learning, which jointly estimates the dictionary built from the training data and an appropriate linear classifier. The advantage of this approach is that it yields sparse codes with enhanced discriminative capability. Extensive comparisons with probabilistic models based on the hypothesis that the extracted speech features follow a generalized Gaussian distribution, as well as with state-of-the-art classification methods such as Gaussian mixture models and joint factor analysis, revealed the superiority of the proposed method.
The second part of this dissertation focuses on low-rank techniques as a powerful tool for extracting reliable features from a speech signal. More specifically, a technique for recovering a low-rank matrix is designed and used to reconstruct those spectral regions of a speech signal that are unreliable due to the presence of noise. The reconstruction is performed with the Singular Value Thresholding (SVT) algorithm, based on the assumption that the logarithmic magnitude representation of a speech signal in the time-frequency domain, obtained via the short-time Fourier transform (STFT), is of low rank. A comparison against the widely used method of sparse imputation, which is based on sparse representations, reveals the superiority of the proposed approach in terms of producing more reliable features. Finally, we propose an extension of the matrix completion method that exploits both the prior knowledge that the data matrix is low rank and the knowledge that the data can be represented efficiently in terms of a dictionary. In particular, we propose an algorithm for joint low-rank representation and matrix completion (J-SVT). Compared with the standard SVT, J-SVT is superior at computing the low-rank representation of a data matrix in terms of a given dictionary from a small number of observations of the original matrix. Extensive simulations show that J-SVT achieves a lower reconstruction error than the standard SVT in several distinct experimental scenarios.
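Illustrative note (not part of the catalog record): the Singular Value Thresholding step named in the abstract can be sketched as below. This is a minimal sketch of the generic SVT iteration for matrix completion, not the author's implementation. The function names, the synthetic random low-rank matrix standing in for a log-magnitude spectrogram, and the values chosen for `tau`, `step`, and the sampling rate are all assumptions made for illustration.

```python
import numpy as np


def shrink_singular_values(Y, tau):
    """Soft-threshold the singular values of Y by tau."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def svt_complete(M, mask, tau, step, n_iter=1000, tol=1e-4):
    """Estimate a low-rank matrix from the entries of M where mask is True.

    Entries of M outside the mask are never read into the estimate.
    """
    Y = np.zeros_like(M, dtype=float)
    X = np.zeros_like(M, dtype=float)
    observed_norm = np.linalg.norm(M[mask])
    for _ in range(n_iter):
        X = shrink_singular_values(Y, tau)
        # Residual on the observed (reliable) entries only.
        residual = np.where(mask, M - X, 0.0)
        if np.linalg.norm(residual) <= tol * observed_norm:
            break
        Y = Y + step * residual
    return X


def demo_relative_error(seed=0):
    """Complete a synthetic rank-2 matrix; return the error on hidden entries."""
    rng = np.random.default_rng(seed)
    n, rank = 40, 2
    L = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, n))
    mask = rng.random((n, n)) < 0.6  # assumed: about 60% of entries reliable
    X = svt_complete(L, mask, tau=5.0 * n, step=1.5)
    hidden = ~mask
    return np.linalg.norm(X[hidden] - L[hidden]) / np.linalg.norm(L[hidden])


if __name__ == "__main__":
    print("relative error on hidden entries:", demo_relative_error())
```

In the setting the abstract describes, `M` would be the log-magnitude STFT of the noisy speech and `mask` would mark the time-frequency cells judged reliable. How that mask is obtained is not stated in this record.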
Language English
Subject Completion
Dictionary learning
Low-rank matrix
Missing features
Sparse representation
Matrix completion
Issue date 2014-07-08
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Doctoral theses
  Type of Work--Doctoral theses
Permanent Link https://elocus.lib.uoc.gr//dlib/6/6/e/metadata-dlib-1405081065-330927-21697.tkl