de.mpg.escidoc.pubman.appbase.FacesBean
English
 
Help Guide Disclaimer Contact us Login
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT

Released

Conference Paper

AV Processing in eHumanities - a paradigm shift

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons216

Wittenburg,  Peter
Technical Group, MPI for Psycholinguistics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons4464

Lenkiewicz,  Przemyslaw
The Language Archive, MPI for Psycholinguistics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons4

Auer,  Eric
The Language Archive, MPI for Psycholinguistics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons4454

Gebre,  Binyam Gebrekidan
The Language Archive, MPI for Psycholinguistics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons37858

Lenkiewicz,  Anna
The Language Archive, MPI for Psycholinguistics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons39383

Drude,  Sebastian
The Language Archive, MPI for Psycholinguistics, Max Planck Society;

Locator
There are no locators available
Fulltext (public)

Wittenburg_HamburgUP_dh2012_BoA.pdf
(Publisher version), 140KB

Supplementary Material (public)
There is no public supplementary material available
Citation

Wittenburg, P., Lenkiewicz, P., Auer, E., Gebre, B. G., Lenkiewicz, A., & Drude, S. (2012). AV Processing in eHumanities - a paradigm shift. In J. C. Meister (Ed.), Digital Humanities 2012 Conference Abstracts. University of Hamburg, Germany; July 16–22, 2012 (pp. 538-541).


Cite as: http://hdl.handle.net/11858/00-001M-0000-000F-47EE-9
Abstract
Introduction Speech research saw a dramatic change in paradigm in the 90-ies. While earlier the discussion was dominated by a phoneticians’ approach who knew about phenomena in the speech signal, the situation completely changed after stochastic machinery such as Hidden Markov Models [1] and Artificial Neural Networks [2] had been introduced. Speech processing was now dominated by a purely mathematic approach that basically ignored all existing knowledge about the speech production process and the perception mechanisms. The key was now to construct a large enough training set that would allow identifying the many free parameters of such stochastic engines. In case that the training set is representative and the annotations of the training sets are widely ‘correct’ we could assume to get a satisfyingly functioning recognizer. While the success of knowledge-based systems such as Hearsay II [3] was limited, the statistically based approach led to great improvements in recognition rates and to industrial applications.