Identifier 000361529
Title Text-independent speaker identification using sparsely excited speech signals and compressed sensing
Alternative Title Αναγνώριση ομιλητή ανεξάρτητη από το κείμενου με τη βοήθεια αραιών σημάτων και της θεωρίας συμπιεστικής δειγματοληψίας
Author Καραμιχάλη, Ελένη Ηλία
Thesis advisor Μουχτάρης, Αθανάσιος
Abstract Compressed Sensing (CS) is an emerging theory which holds that sampling at the Nyquist rate yields far more samples than necessary. According to the Nyquist sampling theorem, the sampling rate of a signal must be at least twice its maximum frequency. In contrast, CS seeks to represent a signal using a small number of linear, non-adaptive measurements, far fewer than the signal’s bandwidth would dictate. Thus, CS accomplishes both compression and sampling in one low-complexity step. The only requirement for CS to be effective is that the signal is sparse in some basis, meaning that it has only a few nonzero coefficients in that basis. Compressed sensing has been used for full signal reconstruction, but in our case it was used for feature recovery in order to perform text-independent speaker identification. Speaker identification is the task of recognizing a speaker, given that the speaker belongs to a database that has been modeled beforehand using features extracted from each speaker’s training set. Specifically, we trained a Gaussian Mixture Model for each speaker in the database, using Line Spectral Frequencies as features. Text-independent speaker identification means that the test speech signals were not included in the training phase. We chose to use CS theory for speaker identification for two reasons. The first is that CS requires just a few samples to reconstruct a signal, which is very useful in environments such as sensor networks, where the amount of data that can be sent between sensor nodes is limited. Thus, although traffic is limited, we are still able to avoid information loss. The second reason is that CS algorithms are robust to noise: they force the signals to be sparse in some basis, which results in neglecting low-energy noisy samples.
After experimenting with several CS algorithms for signal reconstruction, we chose Orthogonal Matching Pursuit for our research because of its low complexity and because it gave the lowest feature distortion after reconstruction. The results may not be as good as those obtained with features extracted from the original speech signals, but they are quite good given the number of samples used, and they are very promising for future investigation and research.
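The abstract names Orthogonal Matching Pursuit but this record does not reproduce its implementation. As a rough illustration only, the following is a minimal NumPy sketch of the standard OMP algorithm recovering a sparse vector from a small number of linear, non-adaptive measurements. The Gaussian measurement matrix, the sparsity in the canonical basis, the dimensions, and all names (`omp`, `Phi`, `k`) are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np

def omp(Phi, y, k):
    """Greedy OMP sketch: estimate a k-sparse x from y = Phi @ x."""
    n = Phi.shape[1]
    residual = y.astype(float).copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(Phi.T @ residual)
        if support:
            correlations[support] = 0.0  # never select a column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit on the selected columns, then update the residual.
        coeffs, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coeffs
    x_hat = np.zeros(n)
    x_hat[support] = coeffs
    return x_hat

# Illustrative setup (assumed values): a length-256 signal with 5 nonzeros,
# observed through 128 random Gaussian measurements.
rng = np.random.default_rng(0)
n, m, k = 256, 128, 5
x = np.zeros(n)
idx = rng.choice(n, size=k, replace=False)
x[idx] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x  # m linear, non-adaptive measurements

x_hat = omp(Phi, y, k)
print("reconstruction error:", np.linalg.norm(x - x_hat))
```

In this noiseless toy setting the sparsity level `k` is assumed known; with real speech features one would more likely stop on a residual threshold, and the sparsifying basis would not be the canonical one.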
Language English
Subject Compressed sensing
Sparse signals
Speaker identification
Αναγνώριση ομιλητή
Αραιά σήματα
Συμπιεστική δειγματοληψία
Issue date 2010-11-19
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses
