Meeting Abstract

Perception of accentuation in audio-visual speech

MPG Authors
Nusseck,  M
Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Cunningham,  DW
Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Wallraven,  C
Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Bülthoff,  HH
Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Citation

Nusseck, M., Cunningham, D., Wallraven, C., & Bülthoff, H. (2006). Perception of accentuation in audio-visual speech. In 2nd Enactive Workshop at McGill University.


Citation link: https://hdl.handle.net/11858/00-001M-0000-0013-D20F-9
Abstract
Introduction:
In everyday speech, auditory and visual information are tightly coupled. Consistent with this, previous research has shown that facial and head motion can improve the intelligibility of speech (Massaro et al., 1996; Munhall et al., 2004; Saldana & Pisoni, 1996). The multimodal nature of speech is particularly noticeable in emphatic speech, where it can be exceedingly difficult to produce the proper vocal stress patterns without also producing the accompanying facial motion. Using a detection task, Swerts and Krahmer (2004) demonstrated that information about which word is emphasized exists in both the visual and acoustic modalities. It remains unclear, however, what roles visual and auditory information each play in the perception of emphasis intensity. Here, we validate a new methodology for acquiring, presenting, and studying verbal emphasis, which can subsequently be used to explore the perception and production of believable accentuation.
Experiment:
Participants were presented with a series of German sentences in which a single word was emphasized. For each of the 10 base sentences, two factors were manipulated. First, the semantic category was varied: the accent-bearing word was either a verb, an adjective, or a noun. Second, the intensity of the emphasis was varied (no, low, or high). The participants' task was to rate the intensity of the emphasis on a 7-point Likert scale (1 indicating weak and 7 strong). Each of the 70 sentences was recorded by 8 German speakers (4 male and 4 female), yielding a total of 560 trials.
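The abstract does not spell out how the 10 base sentences become 70 stimuli. One reading that reproduces the reported counts, offered here only as an assumption, is that each base sentence has one "no" emphasis version (with no accented word, so no word category) plus one version for each combination of word category and "low" or "high" emphasis. The sketch below enumerates that hypothetical design; all names in it are illustrative and not taken from the study.

```python
from itertools import product

# Values stated in the abstract.
N_BASE_SENTENCES = 10
N_SPEAKERS = 8
WORD_CATEGORIES = ("verb", "adjective", "noun")

# ASSUMPTION: only "low" and "high" emphasis are crossed with word category;
# the "no" emphasis version has no accented word and occurs once per sentence.
ACCENTED_LEVELS = ("low", "high")


def build_stimuli():
    """Return (sentence_id, word_category, emphasis_level) tuples."""
    stimuli = []
    for sentence_id in range(1, N_BASE_SENTENCES + 1):
        # One unaccented version per base sentence.
        stimuli.append((sentence_id, None, "no"))
        # Six accented versions: 3 word categories x 2 emphasis levels.
        for category, level in product(WORD_CATEGORIES, ACCENTED_LEVELS):
            stimuli.append((sentence_id, category, level))
    return stimuli


stimuli = build_stimuli()
# Every speaker records every stimulus.
trials = [(speaker, stimulus)
          for speaker in range(1, N_SPEAKERS + 1)
          for stimulus in stimuli]

print(len(stimuli), len(trials))  # prints "70 560"
```

Under this assumed design, each base sentence contributes 1 + 3 × 2 = 7 versions, which gives 10 × 7 = 70 sentences and 70 × 8 = 560 trials, matching the totals reported above.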
Results and Conclusion:
Overall, the results show that people can produce and recognize different levels of accentuation. All "high" emphasis sentences were rated as more intense (5.2, on average) than the "low" emphasis sentences (4.1, on average). Both conditions were rated as more intense than the "no" emphasis sentences (1.9). Interestingly, "verb" sentences were rated as more intense than either the "noun" or "adjective" sentences, which were remarkably similar to each other. Critically, the pattern of intensity ratings was the same for each of the ten sentences, strongly suggesting that the effect was solely due to the semantic role of the emphasized word. We are currently employing this framework to examine the multimodal production and perception of emphatic speech more closely.