

Friday, 10 August 2018

REAL TIME SPEECH DRIVEN FACE ANIMATION

Abstract
The goal of this project is to implement a system that analyses an audio signal containing speech and produces a classification of lip-shape categories (visemes) in order to synchronize the lips of a computer-generated face with the speech. The thesis describes the work to derive a method that maps speech to lip movements on an animated face model in real time. The method is implemented in Matlab. The program reads speech from pre-recorded audio files and continuously performs spectral analysis of the speech. Neural networks are used to classify the speech into a sequence of phonemes, and the corresponding visemes are shown on the screen. Some time delay between the input speech and the visualization could not be avoided, but the overall visual impression is that sound and animation are synchronized.
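As an illustration of the pipeline summarised above, the following is a minimal Matlab sketch of the frame-by-frame analysis loop. The frame sizes, the log-spectral features, the pretrained network object net and the helper phonemeToViseme are assumptions made for illustration only, not the actual parameters of the thesis.

[speech, fs] = audioread('speech.wav');        % pre-recorded mono speech file
frameLen = round(0.020 * fs);                  % 20 ms analysis frame (assumed)
hopLen   = round(0.010 * fs);                  % 10 ms hop between frames (assumed)
win      = hamming(frameLen);                  % analysis window

for start = 1:hopLen:(numel(speech) - frameLen + 1)
    frame  = speech(start:start + frameLen - 1) .* win;
    mag    = abs(fft(frame));                      % short-time magnitude spectrum
    feat   = log(mag(1:floor(frameLen/2)) + eps);  % log-spectral feature vector
    scores = net(feat);                            % phoneme scores from a trained network (assumed)
    [~, phonemeIdx] = max(scores);                 % most likely phoneme class
    visemeIdx = phonemeToViseme(phonemeIdx);       % hypothetical phoneme-to-viseme lookup
    % the face model would be updated with visemeIdx here
end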
Chapter one
Introduction
1.1 Background of the study
The human face is an extremely important communication channel. The face can convey a great deal of information, such as emotions, intentions, or the general condition of the person. In noisy environments, lip movements can compensate for a possible loss in the speech signal. Moreover, the visual component of speech plays a key role for hearing-impaired people. Beyond its communicative functions, the human face is a primary element in human recognition. Composed of a complex structure of bones and muscles, it is extremely flexible and capable of a wide range of movements and facial expressions. This anatomical complexity, combined with human sensitivity to discontinuities in simulated facial movements, makes face animation one of the most difficult and challenging research areas in computer animation.
Virtual humans are graphical simulations of real or imaginary persons capable of human-like behaviour, most importantly talking and gesturing [1]. When integrated into an application, a virtual human representing a real human brings life and personality, improves realism, and in general provides a more natural interface. The rules of human behaviour imply, among other things, speech and facial displays: in face-to-face conversation among humans, both verbal and nonverbal communication take place. For a realistic result, lip movements must be perfectly synchronized with the audio. Beyond lip sync, realistic face animation includes facial displays. In our work we are interested in those facial displays that are not explicit emotional displays (e.g. an expression such as a smile) and also those that are not explicit verbal displays.
The goal of this project is to construct and implement a real-time speech-to-face-animation system. The program is based on the Visage Technologies [2] software. Neural networks are used to classify the incoming speech, and the program shows an animated face that mimics the sound. The animation itself is already implemented, so the work done in this thesis focuses on the signal processing of the audio signal and on the implementation of the speech-to-lip mapping and synchronization.
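The speech-to-lip mapping itself can be realised as a simple lookup from phoneme classes to visemes. The fragment below is a sketch loosely following the MPEG-4 viseme grouping; the phoneme set and the exact grouping used in the thesis may differ.

% Hypothetical phoneme-to-viseme lookup (MPEG-4-style grouping, assumed):
phonemes = {'p', 'b', 'm', 'f', 'v', 's', 'z', 't', 'd', 'k', 'g'};
visemes  = [ 1,   1,   1,   2,   2,   7,   7,   4,   4,   5,   5 ];
map = containers.Map(phonemes, visemes);
map('b')   % returns 1: /p/, /b/ and /m/ share the bilabial closure viseme

Grouping several phonemes onto one viseme is what keeps the classification tractable: sounds that look alike on the lips, such as /p/, /b/ and /m/, need not be distinguished by the network.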
It is very important that the facial animation and the sound are synchronized, which places strict demands on the program's processing delay. Some delay must be accepted, since speech has to be spoken before it can be classified. The goal set for this thesis is 100 ms as the upper limit on the delay from input speech to visualization.
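As a rough illustration of this budget (all numbers assumed for illustration, not taken from the thesis), buffering a full analysis frame before classification already consumes part of the 100 ms limit:

% Illustrative delay budget (assumed numbers):
frameMs   = 20;                   % audio buffered per analysis frame
limitMs   = 100;                  % upper limit set for the thesis
processMs = limitMs - frameMs;    % remaining for features, network and rendering
fprintf('Budget left for processing and rendering: %d ms\n', processMs);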
1.2 Statement of the problem
The problems associated with the existing system include the following:
i. There is no current recovery method for point loss
ii. Feature points skewed in perspective create playback artifacts
iii. There are only 22 feature points, which cannot fully describe a face
iv. Initialization requires mouse-clicking the markers on the first frame
v. The current algorithm is computationally costly


REAL TIME SPEECH DRIVEN FACE ANIMATION

Chapters: 1 - 5
Delivery: Email
Number of Pages: 75

Price: 5000 NGN
In Stock


 
