
Record


Released

Poster

Audiovisual synchrony detection for speech and music signals

MPG Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons84042

Lee, H
Research Group Cognitive Neuroimaging, Max Planck Institute for Biological Cybernetics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons84112

Noppeney, U
Research Group Cognitive Neuroimaging, Max Planck Institute for Biological Cybernetics, Max Planck Society;

External resources
No external resources are available
Full texts (freely accessible)
No freely accessible full texts are available
Supplementary material (freely accessible)
No freely accessible supplementary materials are available
Citation

Lee, H., & Noppeney, U. (2009). Audiovisual synchrony detection for speech and music signals. Poster presented at 10th International Multisensory Research Forum (IMRF 2009), New York, NY, USA.


Citation link: http://hdl.handle.net/11858/00-001M-0000-0013-C403-1
Abstract
Introduction: Audiovisual integration depends crucially on the relative timing of the auditory and visual signals. Although multisensory signals do not have to be precisely synchronous to be perceived as a single temporal event, they must co-occur within a certain temporal window of integration. To investigate how the human brain is fine-tuned to the natural temporal statistics of audiovisual signals, we characterized the temporal integration window for natural speech, sinewave replicas of natural speech (SWS), and music in a simultaneity judgment task.

Methods: The experimental paradigm manipulated (1) stimulus class: speech vs. SWS vs. music, and (2) stimulus length: short (i.e. natural syllables, SWS syllables and tones) vs. long (i.e. natural sentences, SWS sentences and melodies). Audiovisual asynchronies ranged from -360 ms (auditory leading) to 360 ms (visual leading) in 60 ms increments. Eight participants performed the experiment on 2 separate days. The order of conditions was counterbalanced within and between subjects. The proportion of synchronous responses was computed for each participant. To avoid distributional assumptions, the psychometric curve of each participant was characterized by four indices: (i) peak performance, (ii) peak location, (iii) width and (iv) asymmetry [1]. The four indices were analyzed using repeated-measures ANOVAs with stimulus class and stimulus length as within-subjects factors.

Results: The ANOVA for peak performance showed no significant main effect of stimulus class or length [F(2,14)<1, n.s.; F(1,7)=1.6, p=.24]. The ANOVA for peak location revealed a significant interaction between stimulus class and length [F(2,14)=3.8, p<.05]. Post-hoc paired t-tests revealed that the peak locations were significantly shifted towards auditory leading for melodies compared to tones [t(7)=2.4, p<.05], and marginally so for melodies compared to SWS sentences [t(7)=-2.3, p=.053].
The ANOVA for width revealed significant main effects of stimulus class and length [F(2,14)=9.3, p<.005; F(1,7)=11.0, p<.05] in the absence of an interaction [F(2,14)<1, n.s.]. Post-hoc paired t-tests revealed that the curves were wider for SWS speech than for natural speech [t(7)=7.0, p<.005] and music [t(7)=2.4, p=.05]. Furthermore, the curves were narrower for long stimuli (i.e. sentences and melodies) than for short stimuli (i.e. syllables and tones) [t(7)=-3.3, p<.05]. For asymmetry, there was a significant main effect of stimulus length [F(1,7)=7.1, p<.05] but not of stimulus class [F(2,14)=1.1, p=.35], indicating that the psychometric curves were more asymmetric for long stimuli than for short stimuli.

Conclusion: The psychometric curves were narrower and more asymmetric for long stimuli (i.e. sentences and melodies) than for short stimuli (i.e. syllables and tones). Thus, participants may rely on information from across the entire sentence for synchrony judgments. In addition, the psychometric curves were wider but less asymmetric for SWS speech relative to natural speech and music. Collectively, our results support the hypothesis that audiovisual speech perception is fine-tuned to the natural mapping between facial movement and the spectrotemporal structure of natural speech.
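
Illustrative sketch: the abstract names four distribution-free indices of each psychometric curve but does not define them (it cites [1] for the definitions). The Python sketch below shows one way such indices could be computed from the proportion of synchronous responses at each asynchrony. The function names `soa_grid` and `curve_indices`, and in particular the area-based definitions of width and asymmetry, are assumptions made for illustration and are not taken from the poster or from [1]. Only the asynchrony grid (-360 ms to 360 ms in 60 ms steps, negative meaning auditory leading) comes from the abstract.

```python
# Illustrative sketch only: the index definitions below are assumptions,
# not the definitions used in the poster or in its reference [1].

def soa_grid(lo_ms=-360, hi_ms=360, step_ms=60):
    """Asynchronies in ms; negative = auditory leading, positive = visual leading."""
    return list(range(lo_ms, hi_ms + 1, step_ms))


def curve_indices(soas, p_sync):
    """Describe a simultaneity-judgment curve without fitting a distribution.

    soas   : asynchronies in ms, in ascending order
    p_sync : proportion of 'synchronous' responses at each asynchrony
    """
    if len(soas) != len(p_sync):
        raise ValueError("soas and p_sync must have the same length")

    # (i) peak performance and (ii) peak location:
    # the maximum proportion and the asynchrony at which it occurs
    # (the first maximum is used if there are ties).
    peak_idx = max(range(len(p_sync)), key=lambda i: p_sync[i])
    peak = p_sync[peak_idx]
    peak_loc = soas[peak_idx]

    # Trapezoidal area under the curve between two sample indices.
    def area(i0, i1):
        return sum(
            0.5 * (p_sync[i] + p_sync[i + 1]) * (soas[i + 1] - soas[i])
            for i in range(i0, i1)
        )

    left = area(0, peak_idx)                # auditory-leading side of the peak
    right = area(peak_idx, len(soas) - 1)   # visual-leading side of the peak
    total = left + right

    # (iii) width (assumed definition): area divided by peak height, in ms.
    width = total / peak if peak > 0 else 0.0
    # (iv) asymmetry (assumed definition): normalised area difference;
    # values > 0 mean more area on the visual-leading side of the peak.
    asymmetry = (right - left) / total if total > 0 else 0.0

    return {
        "peak": peak,
        "peak_location_ms": peak_loc,
        "width_ms": width,
        "asymmetry": asymmetry,
    }
```

Under these assumed definitions, `soa_grid()` yields the 13 asynchrony levels described in the Methods, and `curve_indices` would be applied once per participant and condition before the resulting indices enter the repeated-measures ANOVAs.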