Les lèvres parlantes (The Talking Lips) / IMIMA


Image(s): 640*480, JPEG (18 KB)



Project : Talking-face synthesis (Synthèse de visages parlants)

  • URL :

    Video(s) and extracted images: 320*240

    QuickTime video (2.2 MB); JPEG images (13 KB)

    QuickTime video (3.0 MB); JPEG images (11 KB, 8 KB)


    Analysis/synthesis method for talking lips and faces

    Technical Information

    Image analysis, image synthesis, audio synchronization, natural face / synthetic face


    • Bibliography :

      C. Benoît, A. Adjoudani, O. Angola, T. Guiard-Marigny & B. Le Goff, "Perception, Analysis and Synthesis of Talking Lips" ("Perception, synthèse et analyse des lèvres parlantes"), Actes Imagina 1994, pp. 144-163, 1994.

    • Abstract :

      A virtual actor can only aspire to "anthropomorphic quality" if his or her lip movements in particular, and facial movements in general, are coherent with the acoustic message supposedly being produced. The auditory modality dominates the perception of speech for listeners with normal hearing, but the visual modality enhances understanding. Although the visual information supplied by movements of the lips, chin, teeth, cheeks, etc. is in itself insufficient to make speech intelligible, sight of the speaker's face allows the "restoration", through natural compensation, of much of the oral information that is lost under degraded acoustic transmission conditions. We quantified the gain in intelligibility provided by visual information when speech is degraded by noise. Our test conditions included natural or synthetic speech synchronized with a natural face or with different parts of a synthetic face (the 3D lip models of the Institut de la Communication Parlée, and the Parke face). The anatomical and geometric parameters most characteristic of the production and perception of visible speech were identified through multidimensional analysis of a very large corpus of spoken French. Software for the automatic extraction of these parameters was implemented on an image capture and processing workstation. In parallel, a high-resolution parametric three-dimensional lip model was developed and implemented on a computer graphics workstation. A control interface was likewise developed to drive the articulatory commands of the Parke facial model (1974), as modified by Cohen (1993), from automatic measurements performed on the speaker's face. Both models were evaluated in terms of the intelligibility they confer on acoustically degraded natural speech.
      Thus, our synthetic lip model, devoid of teeth, tongue or jaw, and driven at a bit rate of just a few bits per second, transmits more than one third of the information provided by viewing the natural face of the reference speaker (i.e., a model). Finally, a full analysis/synthesis chain for the talking face was developed at the Institut de la Communication Parlée, allowing real-time visual speech cloning between two distant machines. A system of this kind is particularly well suited to real-time visual animation of characters, providing high-quality synchronization of lip movements. A real-time demonstration resulting from a collaboration between Medialab (Paris) and the Institut de la Communication Parlée (Grenoble) will also present the ICP analysis system, used to remotely control a synthetic face developed by Medialab.
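      The chain described above boils down to measuring a handful of geometric lip parameters on the natural face, quantizing them, and replaying them on the synthetic model at the receiving end. A minimal sketch of that idea in Python follows; the parameter names, value ranges and quantization step are illustrative assumptions, not the ICP model's actual control set.

```python
# Hypothetical sketch of a low-bit-rate lip-parameter stream, in the spirit of
# the ICP analysis/synthesis chain. All names and ranges are assumptions made
# for illustration, not the real ICP parameter set.

LIP_PARAMS = ["inner_width", "inner_height", "protrusion"]  # illustrative set
PARAM_RANGE = (0.0, 4.0)   # assumed measurement range, in centimetres
BITS_PER_PARAM = 5         # coarse quantization keeps the data rate very low

def quantize(value, lo=PARAM_RANGE[0], hi=PARAM_RANGE[1], bits=BITS_PER_PARAM):
    """Map a measured parameter to an integer code of `bits` bits."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)

def dequantize(code, lo=PARAM_RANGE[0], hi=PARAM_RANGE[1], bits=BITS_PER_PARAM):
    """Recover an approximate parameter value from its integer code."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)

def encode_frame(measurements):
    """Pack one video frame's automatic lip measurements into codes."""
    return [quantize(measurements[name]) for name in LIP_PARAMS]

def decode_frame(codes):
    """Recover approximate parameters to drive the synthetic lip model."""
    return {name: dequantize(c) for name, c in zip(LIP_PARAMS, codes)}

# One analysis frame, e.g. from automatic image measurement of the speaker:
frame = {"inner_width": 2.7, "inner_height": 0.9, "protrusion": 1.4}
codes = encode_frame(frame)          # what travels between the two machines
recovered = decode_frame(codes)      # what animates the synthetic lips
```

Under these assumptions the control stream is on the order of a few hundred bits per second (3 parameters, 5 bits each, at video frame rate), a tiny fraction of the bandwidth a video channel would need, which is what makes remote real-time lip cloning practical.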

    • Some external links :

      Publications on the analysis/synthesis of speech, acoustic and visual
    • Some more Comments :

      This information comes from a fax sent by C. Benoît.

  • Copyright © 1994-2024