Abstract |
Human speech is a multi-dimensional signal consisting of acoustic and optical components. The presence of both improves the quality of communication. A very important parameter of the optical component is mouth movement, since everybody can perceive that a speaker's mouth movement is synchronized with the speech. For the case where only the acoustic part of speech is available, we developed four different approaches to a lip-sync algorithm that converts human speech into mouth movements. All of the approaches are based on linear prediction analysis, which is commonly used in sound processing. We use a total of eight distinct mouth positions (visemes), which correspond to the most distinct positions of the mouth during speech. The input to every approach of the algorithm is a speech signal and the output is a sequence of visemes. We analyze the need for smoothing the sequence of visemes in order to improve the realism of the output, and we present smoothing methods that assume that, at each instant, a number of the following visemes are known. In the algorithm's first approach we assign one viseme to each 20 ms frame (Frame) of the speech signal. In the second approach we assign one viseme to each 40 ms frame (Big Frame) of the speech signal. In the third approach we regard each following viseme as known and we smooth the sequence of visemes according to a heuristic algorithm. Finally, in the fourth approach we assume that up to four following visemes are known, so we use a more complex heuristic algorithm with more smoothing rules for the sequence of visemes. The first three approaches can be implemented in real time. The last approach gives the best lip-sync results. In all of the approaches we use the energy and the number of zero-crossings from the time domain and the linear prediction smooth spectrum from the frequency domain.
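As an illustration of the three per-frame features named above (energy, number of zero-crossings, and the linear prediction smooth spectrum), the following is a minimal sketch in Python rather than in the MATLAB application itself. It is not the implementation used in this work: the function names, the prediction order, the number of spectrum points, and the use of the autocorrelation method with the Levinson-Durbin recursion are all assumptions made for the example.

```python
import cmath
import math


def frame_energy(frame):
    """Short-time energy: sum of squared samples in the frame."""
    return sum(x * x for x in frame)


def zero_crossings(frame):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of the frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Levinson-Durbin recursion (autocorrelation method).

    Returns (a, err): a[0] == 1 and A(z) = sum_k a[k] z^-k is the
    prediction-error filter; err is the residual energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lp_spectrum(a, gain, n_points=64):
    """Smooth spectrum gain / |A(e^{jw})|^2 sampled at w = pi*m/n_points."""
    spectrum = []
    for m in range(n_points):
        w = math.pi * m / n_points
        big_a = sum(a[k] * cmath.exp(-1j * w * k) for k in range(len(a)))
        spectrum.append(gain / max(abs(big_a) ** 2, 1e-12))
    return spectrum


def analyze_frame(frame, order=10, n_points=64):
    """Collect the three features for one frame (order is an assumed default)."""
    a, err = lpc(frame, order)
    return {
        "energy": frame_energy(frame),
        "zero_crossings": zero_crossings(frame),
        "lp_spectrum": lp_spectrum(a, err, n_points),
    }
```

Assuming, for example, a sampling rate of 8 kHz (not stated in this abstract), a 20 ms Frame would contain 160 samples and a 40 ms Big Frame 320 samples; either could be passed to `analyze_frame` as a list of floats. How the resulting features are mapped to the eight visemes is not covered by this sketch.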
All of the lip-sync approaches were implemented in a MATLAB application, which presents the results in the form of a talking face on the user's screen. Every new user of the application must first complete a short and easy training procedure. Finally, more vivid applications are presented, in which our MATLAB application is combined with 3D modeling and animation programs.
|