Technical reports

Identifier uch.csd.tcl//2000panagiotakis
Title Τμηματοποίηση ήχου και κατηγοριοποίηση σε μουσική και ομιλία (Audio segmentation and classification into music and speech)
Alternative Title A Speech/Music Discriminator based on RMS and zero-crossings
Author Παναγιωτάκης, Κώστας
Author Τζιρίτας, Γιώργος
Abstract In recent years, major efforts have been made to develop methods for extracting information from audio-visual media so that they can be stored in, and retrieved from, databases automatically on the basis of their content. This work deals with the characterization of an audio signal, which may be part of a larger audio-visual system or may be autonomous, as in the case of an audio recording stored digitally on disk. Our goal was to develop a system that first segments the audio signal and then classifies each segment into one of two main categories: speech or music. The system's requirements include processing speed and the ability to operate in a real-time environment. Because classification is restricted to two classes, the number of extracted characteristics is considerably reduced and the required computations are straightforward. Experimental results show that the system is highly efficient without sacrificing performance. Segmentation is based on the distribution of the mean signal amplitude, whereas classification uses an additional characteristic related to frequency. The classification algorithm may be used either in conjunction with the segmentation algorithm, in which case it verifies or refutes a music-to-speech or speech-to-music change, or autonomously, on given audio segments. The basic characteristics are computed over 20 ms intervals, so segment boundaries are located to within 20 ms. The minimum segment length is one second. The segmentation and classification algorithms were benchmarked on a large data set, with correct segmentation about 97% of the time and correct classification about 95% of the time.
Language Greek
Issue date 2000-11-24
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Technical reports
  Type of Work--Technical reports
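
Note: the abstract and alternative title name two per-frame characteristics, RMS amplitude and zero-crossings, computed over 20 ms intervals. The following is a minimal sketch of how such features might be computed; it is not taken from the report. The function name `frame_features`, the use of non-overlapping frames, the treatment of a zero sample as non-negative, and the input format (a plain list of mono samples) are all assumptions made for illustration.

```python
import math


def frame_features(samples, sample_rate, frame_ms=20):
    """Return one (rms, zero_crossings) pair per complete frame.

    Hypothetical helper: frames are non-overlapping and frame_ms long;
    a trailing partial frame is ignored.
    """
    frame_len = sample_rate * frame_ms // 1000
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")

    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Root-mean-square amplitude of the frame.
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Number of sign changes between consecutive samples
        # (assumption: a sample equal to zero counts as non-negative).
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((rms, crossings))
    return features
```

With a sample rate of 8000 Hz, each 20 ms frame holds 160 samples, so one second of audio yields 50 feature pairs. How the report turns these features into segment boundaries and speech/music decisions is not described in this record and is not sketched here.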