
Released

Report

A nonlinear viseme model for triphone-based speech synthesis

MPG Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons44069

Bargmann, Robert
Computer Graphics, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons44144

Blanz, Volker
Computer Graphics, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45449

Seidel, Hans-Peter
Computer Graphics, MPI for Informatics, Max Planck Society;

External Resources
No external resources available
Full texts (freely accessible)

MPI-I-2007-4-003.ps
(any fulltext), 31 MB

Supplementary Material (freely accessible)
No freely accessible supplementary materials available
Citation

Bargmann, R., Blanz, V., & Seidel, H.-P. (2007). A nonlinear viseme model for triphone-based speech synthesis (MPI-I-2007-4-003). Saarbrücken: Max-Planck-Institut für Informatik.


Citation link: http://hdl.handle.net/11858/00-001M-0000-0014-66DC-7
Abstract
This paper presents a representation of visemes that defines a measure of similarity between different visemes, together with a system of viseme categories. The representation is derived from a statistical analysis of feature points on 3D scans, using Locally Linear Embedding (LLE). The similarity measure determines which available visemes and triphones to use when synthesizing 3D face animation for a novel audio file. From a corpus of dynamically recorded 3D mouth articulation data, our system finds the best-suited sequence of triphones over which to interpolate, reusing the coarticulation information to obtain correct mouth movements over time. Thanks to the similarity measure, the system can work with relatively small triphone databases and still find the most appropriate candidates. With the selected sequence of database triphones, we finally morph along the successive triphones to produce the final articulation animation. In an entirely data-driven approach, our automated procedure for defining viseme categories reproduces the groups of related visemes that are defined in the phonetics literature.
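
The abstract does not spell out an implementation, but its core idea, embedding mouth feature points nonlinearly with LLE and measuring viseme similarity as distance in the embedding space, can be sketched in a few lines. The sketch below is a hypothetical illustration, not the authors' code: the toy data, dimensions, and function names are assumptions, and scikit-learn's LocallyLinearEmbedding stands in for whatever LLE implementation the report used.

```python
# Minimal sketch (assumed, not the authors' implementation) of the pipeline
# the abstract describes: embed mouth feature points with LLE, then use
# distance in the embedding space as a viseme similarity measure.
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

rng = np.random.default_rng(0)

# Stand-in for feature points on 3D mouth scans: one flattened
# (n_points * 3)-vector per recorded viseme frame. Real input would be
# registered feature points from the dynamic 3D recordings.
n_frames, n_points = 200, 40
scans = rng.normal(size=(n_frames, n_points * 3))

# Nonlinear dimensionality reduction with Locally Linear Embedding.
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=3)
embedded = lle.fit_transform(scans)  # shape: (n_frames, 3)

def viseme_similarity(i: int, j: int) -> float:
    """Similarity of two visemes as negative Euclidean distance
    in the LLE space (higher means more similar)."""
    return -float(np.linalg.norm(embedded[i] - embedded[j]))

# Pick the database frame most similar to a query frame, as a stand-in
# for the triphone-candidate selection the abstract mentions.
query = 0
best = max(range(1, n_frames), key=lambda j: viseme_similarity(query, j))
print("closest viseme to frame 0:", best)
```

Measuring similarity in the low-dimensional LLE space, rather than on raw vertex coordinates, is presumably what allows a relatively small triphone database to still yield appropriate candidates, since nearby points in the embedding correspond to articulatorily similar mouth shapes.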