
Post-graduate theses

Identifier 000452055
Title A neural-based sinusoidal vocoder
Alternative Title Ένας νευρωνικός ημιτονοειδής φωνοκωδικοποιητής
Author Ραπτάκης, Μιχαήλ Γ.
Thesis advisor Στυλιανού, Ιωάννης
Reviewer Κομοντάκης, Νικόλαος
Πανταζής, Ιωάννης
Abstract Voice encoding is now dominated by neural network models that produce natural-sounding synthetic speech of higher quality than earlier parametric methods. However, this quality comes at the cost of large model size and high computational demands. Additionally, although they take some statistical traits of speech signals into account, most state-of-the-art architectures rarely consider the fundamental, well-studied characteristics and methodologies established in the earlier speech processing literature. In this work, instead of synthesizing speech signals directly using only the “raw power” of neural networks, we take advantage of the quasiperiodic and sinusoidal properties of speech to show how a modern neural vocoder can generate speech from a sinusoidal representation. Starting from MelGAN, chosen for its speed and quality, we extend the model with layers that, instead of outputting the speech waveform directly, estimate the amplitudes and phases of a newly proposed sinusoidal representation. Our results show that the resulting quality is on par with the original MelGAN model in terms of mean opinion score (MOS), indicating that this novel and less expensive approach is feasible. We further experiment with these models and discuss the difficulty of finding a multi-resolution spectral loss that matches the quality of adversarially trained models.
Language English
Subject Deep learning
Neural networks
Βαθιά μάθηση
Νευρωνικά δίκτυα
Issue date 2022-12-02
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses
Permanent Link https://elocus.lib.uoc.gr//dlib/b/0/0/metadata-dlib-1667906572-550601-12721.tkl

Digital Documents

Access to the document is restricted. It will not be available until 2025-12-02.