Identifier |
000361529 |
Title |
Text-independent speaker identification using sparsely excited speech signals and compressed sensing |
Alternative Title |
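Αναγνώριση ομιλητή ανεξάρτητη από το κείμενο με τη βοήθεια αραιών σημάτων και της θεωρίας συμπιεστικής δειγματοληψίας (Text-independent speaker identification with the aid of sparse signals and compressed sensing theory) |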
Author
|
Καραμιχάλη, Ελένη Ηλία
|
Thesis advisor
|
Μουχτάρης, Αθανάσιος
|
Abstract |
Compressed Sensing (CS) is an emerging theory which holds that sampling according to the Nyquist theorem yields far more samples than necessary. According to the Nyquist sampling theorem, the sampling rate of a signal must be at least twice its maximum frequency. In contrast, CS seeks to represent a signal using a small number of linear, non-adaptive measurements, far fewer than the signal’s bandwidth would dictate. Thus, CS accomplishes both compression and sampling in one low-complexity step. The only requirement for CS to be efficient is that the signal is sparse in some basis, meaning that it has only a few nonzero coefficients in that basis.
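As a compact restatement of the two sampling views described in this paragraph, the usual CS notation can be written as follows. The symbols (N, M, K, Ψ, Φ, s, y) are the conventional ones and are assumed here; they do not appear in the record itself.

```latex
\[
  f_s \ge 2 f_{\max}
  \qquad \text{(Nyquist criterion)}
\]
\[
  x = \Psi s, \quad \|s\|_0 = K \ll N
  \qquad \text{(the length-$N$ signal $x$ is $K$-sparse in the basis $\Psi$)}
\]
\[
  y = \Phi x = \Phi \Psi s, \quad \Phi \in \mathbb{R}^{M \times N}, \quad K < M \ll N
  \qquad \text{($M$ linear, non-adaptive measurements)}
\]
```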
Compressed sensing has been used for full signal reconstruction, but in our case it was used for feature recovery in order to perform text-independent speaker identification. Speaker identification is the task of recognizing a speaker, provided that the speaker belongs to a database which has been modeled beforehand using features extracted from each speaker’s training set. Specifically, we trained a Gaussian Mixture Model for each speaker in the database, using Line Spectral Frequencies as features. Text-independent speaker identification means that the test speech signals were not included in the training phase.
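The identification step described above, scoring test features against one Gaussian Mixture Model per speaker and picking the most likely one, might look roughly like the following sketch. It assumes diagonal covariances and already-trained models. The speaker names, the two-dimensional toy feature vectors, and all parameter values are hypothetical (real Line Spectral Frequency vectors would be longer), and this is not the thesis's implementation.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of the frames under one speaker's GMM.

    gmm is a list of (weight, mean, variance) components.
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(terms)
        # log-sum-exp keeps the mixture sum numerically stable
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total

def identify(frames, models):
    """Return the name of the speaker model with the highest likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(frames, models[name]))

# Hypothetical pre-trained models: two components each, 2-D toy features.
models = {
    "speaker_a": [(0.5, [0.3, 1.0], [0.01, 0.02]),
                  (0.5, [0.5, 1.4], [0.01, 0.02])],
    "speaker_b": [(0.6, [0.8, 2.0], [0.01, 0.02]),
                  (0.4, [1.0, 2.4], [0.01, 0.02])],
}

# Hypothetical test frames, deliberately close to speaker_a's component means.
test_frames = [[0.32, 1.05], [0.48, 1.38], [0.29, 0.97]]
best = identify(test_frames, models)
```

Summing per-frame log-likelihoods treats the frames as independent, which is the usual simplification in GMM-based speaker identification.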
We chose to use CS theory for speaker identification for two reasons. The first is that CS theory requires just a few samples to reconstruct a signal, which is very useful in environments such as sensor networks, where the amount of data that can be sent between sensor nodes is limited. Thus, although traffic is limited, we are still able to avoid information loss. The second reason is that CS algorithms are robust to noise: they force the signals to be sparse in some basis, which results in discarding low-energy noisy samples.
After experimenting with several CS algorithms for signal reconstruction, we decided to use Orthogonal Matching Pursuit for our research because of its low complexity and because it gave the lowest feature distortion after reconstruction. The results may not be as good as those obtained with features extracted from the original speech signals, but they are quite good considering the number of samples that were used, and are very promising for future investigation and research.
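For readers unfamiliar with Orthogonal Matching Pursuit, the following is a minimal, dependency-free sketch of the generic greedy algorithm. It is not the thesis's code. The dictionary (identity columns next to normalized Hadamard columns, mutual coherence 1/4), the signal length, and the nonzero values are all hypothetical choices made so that a 2-sparse vector can be recovered from 16 measurements.

```python
import math

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1.0]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def solve(a, b):
    """Solve the square system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = s / aug[i][i]
    return z

def omp(cols, y, k, tol=1e-9):
    """Greedy OMP over unit-norm dictionary columns; returns a k-sparse estimate."""
    residual, support, coef = y[:], [], []
    for _ in range(k):
        if math.sqrt(dot(residual, residual)) < tol:
            break
        # Select the column most correlated with the current residual.
        best = max(range(len(cols)), key=lambda j: abs(dot(cols[j], residual)))
        if best in support:
            break
        support.append(best)
        # Least squares on the selected columns via the normal equations.
        gram = [[dot(cols[i], cols[j]) for j in support] for i in support]
        rhs = [dot(cols[i], y) for i in support]
        coef = solve(gram, rhs)
        residual = [yi - sum(c * cols[j][t] for c, j in zip(coef, support))
                    for t, yi in enumerate(y)]
    x = [0.0] * len(cols)
    for c, j in zip(coef, support):
        x[j] = c
    return x

m = 16
identity = [[1.0 if i == j else 0.0 for i in range(m)] for j in range(m)]
# Sylvester Hadamard matrices are symmetric, so rows double as columns.
cols = identity + [[v / 4.0 for v in row] for row in hadamard(m)]

x_true = [0.0] * len(cols)          # hypothetical 2-sparse vector of length 32
x_true[3], x_true[20] = 2.0, -1.5
y = [sum(x_true[j] * cols[j][t] for j in range(len(cols))) for t in range(m)]

x_hat = omp(cols, y, k=2)
```

Each iteration costs one pass of correlations plus a small least-squares solve, which reflects the low complexity that the abstract cites as a reason for choosing this algorithm.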
|
Language |
English |
Subject |
Compressed sensing |
|
Sparse signals |
|
Speaker identification |
|
Αναγνώριση ομιλητή |
|
Αραιά σήματα |
|
Συμπιεστική δειγματοληψία |
Issue date |
2010-11-19 |
Collection
|
School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
|
|
Type of Work--Post-graduate theses
|
Permanent Link |
https://elocus.lib.uoc.gr//dlib/f/c/f/metadata-dlib-309047b4d51c66e2121bd4e9b38597e9_1287998863.tkl
|