
Results - Details

Identifier 000425807
Title Assessing the quality of audio in musical concert recordings using deep neural networks
Alternative Title Εκτίμηση ποιότητας ηχογραφήσεων από μουσικές συναυλίες με χρήση τεχνικών βαθιάς μάθησης
Author Σίμου, Νίκων Χ.
Thesis advisor Τσακαλίδης, Παναγιώτης
Reviewer Στεφανάκης, Νίκος
Δημητρόπουλος, Ξενοφώντας
Πανταζής, Γιάννης
Abstract The era in which we live can indisputably be characterized by an enormous flow of multimedia information. Using portable multimedia devices such as drones and smartphones, we are able to capture every moment of our lives and of the public events that we attend. A large proportion of the audiovisual recordings from these events becomes available through social media and the large number of websites that provide video and audio content. The availability of such a massive amount of User Generated Recordings (UGRs) has triggered new research directions related to the search, organization and management of this content. In this thesis, we use Deep Neural Networks (DNNs) to create a tool that automatically assesses the audio quality of musical concert recordings that users upload to multimedia platforms such as YouTube. It is well known that DNNs require a large number of training samples, which means that one would need an enormous amount of time to listen to each audio sample and assign it a subjective quality score. We tackle this problem by treating quality assessment as a binary classification problem in which class 0 consists of the set of UGRs from a certain event and class 1 consists of the professional quality recordings from the same event. Furthermore, we use an automatic synchronization process to match every UGR with its corresponding segment from a professional quality recording, which helps make the process invariant to audio content. Experiments conducted with different DNN architectures and acoustic features are presented, showing that the UGR class can be discriminated from the professional quality class with high accuracy.
Language English
Subject Audio processing
Deep learning
Βαθειά μάθηση
Επεξεργασία ήχου
Issue date 2019-11-22
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses
Views 495