E-Locus - Institutional Repository of the University of Crete - Selection of Relevant Features for Audio Classification tasks

Identifier

000370926

Title

Selection of Relevant Features for Audio Classification tasks

Alternative Title

Selection of relevant features for the classification of audio signals

Author

Μαρκάκη, Μαρία Γεώργιος

Thesis advisor

Στυλιανού, Γιάννης

Abstract

Advances in time-frequency distributions and spectral analysis techniques (i.e., for the estimation of amplitude and/or frequency modulations) allow a better representation of non-stationary signals such as speech, highlighting their fine structure and dynamics. Although such representations are very useful for analysis, they complicate classification tasks because of the large number of parameters extracted from the signal (the “curse of dimensionality”). Such tasks therefore require a significant dimensionality reduction. This thesis studies the problem of dimensionality reduction for these time/frequency-frequency representations and proposes criteria for selecting the optimal parameters based on their relevance to a given classification task, where relevance is defined in terms of mutual information. First, tools from multilinear algebra, such as the higher-order SVD, are used to reduce the initial dimensions and the noise components of the representation. Feature selection then proceeds according to the maximum relevance criterion. The suggested process is shown to be equivalent to the maximum dependency criterion for feature selection, without, however, requiring the estimation of multivariate probability densities. The feature selection approach is applied to a number of audio classification tasks, including speech detection in broadcast news and voice pathology detection and discrimination from vowel recordings. For these tasks, the modulation spectral features are shown to be complementary to the state-of-the-art Mel frequency cepstral coefficients. A system for the automatic discrimination of pathological heart murmurs using a high-resolution time-frequency analysis of the phonocardiogram (PCG) is also presented. Its classification accuracy is comparable to the diagnostic accuracy of experienced paediatric cardiologists on the same PCG dataset.
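
The two-stage idea in the abstract (multilinear reduction followed by relevance-based selection) can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the ranks, the histogram-based mutual information estimate, the number of bins, and the synthetic data are all assumptions made for illustration. Each sample is taken to be a 2-D representation stacked into a tensor of shape (samples, dim1, dim2); the two feature modes are compressed with leading singular vectors of the mode unfoldings, and the resulting features are ranked by their estimated mutual information with the class label.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd_factor(tensor, mode, rank):
    """Leading `rank` left singular vectors of the mode-n unfolding."""
    u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
    return u[:, :rank]

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` (shape: new_dim x old_dim)."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(x; y) in nats; y holds class labels."""
    x = np.asarray(x, dtype=float).ravel()
    edges = np.histogram_bin_edges(x, bins=bins)
    xd = np.digitize(x, edges[1:-1])            # bin index in 0..bins-1
    classes, yd = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((bins, classes.size))
    np.add.at(joint, (xd, yd), 1.0)             # joint counts
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def rank_features(X, y, ranks, bins=8):
    """Compress the two feature modes of X, then rank features by relevance."""
    core = X
    for mode, rank in zip((1, 2), ranks):
        u = hosvd_factor(X, mode, rank)         # factor from the original tensor
        core = mode_product(core, u.T, mode)    # project onto the subspace
    feats = core.reshape(X.shape[0], -1)
    scores = np.array([mutual_information(feats[:, j], y, bins)
                       for j in range(feats.shape[1])])
    order = np.argsort(scores)[::-1]            # most relevant first
    return order, scores

# Synthetic two-class data (hypothetical): noise plus a class-dependent pattern.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
pattern = np.outer(np.hanning(6), np.hanning(5))
X = rng.normal(size=(200, 6, 5)) + 2.0 * y[:, None, None] * pattern
order, scores = rank_features(X, y, ranks=(2, 2))
print("features ranked by relevance:", order)
```

Ranking each feature separately only needs univariate estimates of the joint distribution of one feature and the label, which is the practical appeal of a maximum relevance criterion over criteria that require multivariate density estimates.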

Language

English

Subject

Amplitude Modulation

Fourier Transform

Higher Order Singular Value Decomposition

Mutual Information

Αμοιβαία πληροφορία

Διάσπαση ιδιόμορφων τιμών υψηλότερης τάξης

Διαμόρφωση πλάτους

Μετασχηματισμός Fourier

Issue date