Identifier 000417143
Title DRACOSS : a framework for direction of arrival estimation and counting of multiple sound sources with microphone arrays
Alternative Title DRACOSS: an integrated framework for estimating the number and direction of arrival of multiple sound sources with microphone arrays
Author Παυλίδη, Δέσποινα Απόστολος
Thesis advisor Μούχταρης, Αθανάσιος
Reviewer Τσακαλίδης, Παναγιώτης
Pulkki, Ville
Abstract Technological advances have infiltrated our everyday life more than ever before. High intelligence devices and gadgets, equipped with cutting-edge technology algorithms, facilitate and empower our lifestyle. Smart-home automation, next generation hearing aids, and robots with autonomous navigation systems have brought audio signal processing problems to the foreground of the research community. One such problem is the estimation of the number of sources and the directions from which sound originates, what we most frequently call direction of arrival (DOA) estimation. The problem of DOA estimation has been an active research topic for more than thirty years; consequently, a plethora of algorithms have been proposed in the literature. Some of them can be considered classic and frequently come from the telecommunications research area. Beamforming techniques belong in this category, where an appropriately weighted sum of the signals of a microphone array is used to form a receiving beam, which scans the space and detects areas of activity. Subspace approaches, such as the well-known MUSIC algorithm, formulate a spatial function that gets maximized when activity is detected, relying on the decomposition of the array sample covariance matrix. Other algorithms stemmed from research activity on blindly separating mixtures of audio signals, i.e., the blind source separation (BSS) problem. Independent component analysis (ICA) methods, where the goal is to estimate a demixing matrix, which reveals DOA information, and sparse component analysis methods, which exploit the sparsity of activity of the sources in some appropriately chosen domain, both fall into the BSS category. A recently emerging category is that of estimating the intensity vector, which points towards the net flow of sound energy, hence revealing the corresponding DOA of the generating sound source.
The aforementioned methods either fail to estimate DOAs accurately when multiple sources are simultaneously active, e.g., beamforming techniques, or are computationally heavy and significantly affected by the amount of available data, e.g., ICA and subspace approaches, while some are restricted to specific array geometries. We thus observe the lack of a methodology that can address the problem of DOA estimation holistically, tackling all aforementioned aspects of the problem. In this thesis we aim at filling this gap with our proposed DRACOSS framework, i.e., an integrated framework for tackling the problem of DOA estimation and counting of multiple, simultaneously active sound sources utilizing microphone arrays. DRACOSS is developed in two-dimensional (2D) and three-dimensional (3D) spaces, using a uniform circular array and a spherical microphone array, respectively. DRACOSS constitutes a procedure of four distinct steps: (a) exploitation of the sparsity of sound signals, (b) local single-source DOA estimation, (c) histogram formation, and (d) post-processing of the histogram. We detect the sparsity of the involved sound signals in the time-frequency domain by utilizing a relaxed sparsity assumption, which relies on the estimation of a mean correlation coefficient between pairs of microphones. We proceed with the collection of local DOA estimates in detected single-activity areas, which are then used to form histograms. For the 2D case we employ a local DOA estimator designed specifically for circular arrays and form one-dimensional histograms. For the 3D case we use an intensity vector estimator and then form two-dimensional histograms. In both cases, by post-processing the histograms we provide counting and DOA estimation results for all active sound sources.
DRACOSS performs robustly under a wide collection of simulated and real scenarios in terms of noise and reverberation conditions, in terms of the number of simultaneously active sources and in comparison with state-of-the-art methods. We also propose the formulation of two classic DOA methods, i.e., beamforming and MUSIC, through the DRACOSS framework, which manages to significantly improve their performance. Aiming at constantly improving our approach and following the vivid technological stream, we show recent, very promising results on counting by utilizing deep neural networks.
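The four-step procedure described in the abstract can be loosely illustrated for the 2D case with the sketch below. It is not the thesis implementation: the abstract does not give the exact correlation measure, thresholds, histogram parameters, or post-processing rule, so the function names (`mean_pair_correlation`, `is_single_source_zone`, `histogram_peaks`) and all default values are hypothetical. The local DOA estimator of step (b) is omitted, and the histogram step simply assumes that per-zone DOA estimates in degrees are already available.

```python
import numpy as np

def mean_pair_correlation(zone):
    """Mean normalized correlation over adjacent microphone pairs.

    zone: (n_mics, n_bins) array of magnitude spectra for one
    time-frequency zone. Pairing adjacent microphones on a circular
    array is an assumption made for this sketch.
    """
    n_mics = zone.shape[0]
    coeffs = []
    for i in range(n_mics):
        a = zone[i]
        b = zone[(i + 1) % n_mics]  # wrap around the circular array
        denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
        coeffs.append(np.sum(a * b) / denom if denom > 0 else 0.0)
    return float(np.mean(coeffs))

def is_single_source_zone(zone, threshold=0.9):
    """Relaxed sparsity test: keep zones whose mean correlation is high.

    The threshold value is hypothetical.
    """
    return mean_pair_correlation(zone) >= threshold

def histogram_peaks(doas_deg, bin_width=5, smooth_bins=3,
                    min_fraction=0.1, mask_deg=30):
    """Form a circular 1D histogram of local DOA estimates and pick peaks.

    Peaks are picked greedily; bins within mask_deg of a picked peak are
    zeroed before the next pick. The number of peaks is the source count.
    This greedy rule is a simple stand-in, not the thesis's method.
    """
    n_bins = 360 // bin_width
    hist, _ = np.histogram(np.mod(doas_deg, 360.0), bins=n_bins,
                           range=(0.0, 360.0))
    hist = hist.astype(float)
    # Circular moving average so that 0 and 360 degrees are neighbours.
    pad = smooth_bins // 2
    smoothed = np.mean([np.roll(hist, s) for s in range(-pad, pad + 1)],
                       axis=0)
    total = smoothed.sum()
    centers = (np.arange(n_bins) + 0.5) * bin_width
    work = smoothed.copy()
    peaks = []
    while total > 0:
        k = int(np.argmax(work))
        if work[k] <= 0 or work[k] < min_fraction * total:
            break
        peaks.append(float(centers[k]))
        # Angular distance on the circle, in [0, 180].
        dist = np.abs((centers - centers[k] + 180.0) % 360.0 - 180.0)
        work[dist <= mask_deg] = 0.0
    return sorted(peaks)
```

In use, one would keep only the zones that pass `is_single_source_zone`, run a local DOA estimator on each, and pass the collected angles to `histogram_peaks`; the length of the returned list serves as the estimated number of sources.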
Language English
Subject Histogram processing
Sound intensity vector
Sparsity
Spherical harmonic domain
Spherical microphone arrays
Time-frequency domain
Issue date 2018-07-20
Collection   Faculty/Department--Faculty of Sciences and Engineering--Department of Computer Science--Doctoral theses
  Type of Work--Doctoral theses
Permanent Link https://elocus.lib.uoc.gr//dlib/f/0/a/metadata-dlib-1531378806-10323-5366.tkl