Multiple Active Speaker Localization Based on Audio-visual Fusion in Two Stages

Li, Zhao; Herfet, Thorsten; Grochulla, Martin Peter; Thormählen, Thorsten

doi:10.1109/MFI.2012.6343015

Li, Z., Herfet, T., Grochulla, M. P., & Thormählen, T. (2012). Multiple Active Speaker Localization Based on Audio-visual Fusion in Two Stages. In 2012 IEEE Conference on Multisensor Fusion and Integration for Intelligent Systems (pp. 262-268). Piscataway, NJ: IEEE. doi:10.1109/MFI.2012.6343015.

Item is: Released

Basic Information

Item permalink: https://hdl.handle.net/11858/00-001M-0000-0014-F319-D
Version permalink: https://hdl.handle.net/11858/00-001M-0000-001A-30A6-6

Genre: Conference Paper

Files

Creators

Creators:
Li, Zhao¹, Author
Herfet, Thorsten¹, Author
Grochulla, Martin Peter², Author
Thormählen, Thorsten², Author

Affiliations:
1 External Organizations, ou_persistent22
2 Computer Graphics, MPI for Informatics, Max Planck Society, ou_40047

Content

Keywords: -

Abstract: Localization of multiple active speakers in natural environments with only two microphones is a challenging problem. Reverberation degrades the performance of speaker localization based exclusively on directional cues. The audio modality alone has problems with localization accuracy, while the video modality alone has problems with false speaker activity detections. This paper presents an approach based on audio-visual fusion in two stages. In the first stage, speaker activity is detected based on audio-visual fusion, which can handle false lip movements. In the second stage, a Gaussian fusion method is proposed to integrate the estimates of both modalities. As a consequence, localization accuracy and robustness are significantly increased compared to either the audio or the video modality alone. Experimental results in various scenarios confirmed the improved performance of the proposed system.
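The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no formulas, so the code below assumes that each modality reports a one-dimensional direction estimate (for example an azimuth in degrees) as a Gaussian with a mean and a variance, that stage one is a simple agreement check between audio activity and lip movement, and that stage two is a standard product-of-Gaussians (inverse-variance weighted) fusion. All names (`Estimate`, `speaker_is_active`, `fuse_gaussians`, `localize`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Estimate:
    """Gaussian estimate of a speaker direction (assumed representation)."""
    mean: float      # e.g. azimuth in degrees
    variance: float  # uncertainty of the estimate, must be > 0


def speaker_is_active(audio_active: bool, lips_moving: bool) -> bool:
    """Stage 1 (assumed rule): accept a speaker only if both modalities agree.

    Lip movement without audio activity is treated as a false detection.
    """
    return audio_active and lips_moving


def fuse_gaussians(audio: Estimate, video: Estimate) -> Estimate:
    """Stage 2 (assumed rule): product of two Gaussians.

    The fused mean is the inverse-variance weighted average of the inputs,
    and the fused variance is smaller than either input variance.
    """
    if audio.variance <= 0 or video.variance <= 0:
        raise ValueError("variances must be positive")
    precision = 1.0 / audio.variance + 1.0 / video.variance
    mean = (audio.mean / audio.variance + video.mean / video.variance) / precision
    return Estimate(mean=mean, variance=1.0 / precision)


def localize(audio_active: bool, lips_moving: bool,
             audio: Estimate, video: Estimate) -> Optional[Estimate]:
    """Run both stages; return None when the speaker is judged inactive."""
    if not speaker_is_active(audio_active, lips_moving):
        return None
    return fuse_gaussians(audio, video)
```

Under these assumptions, the fused estimate leans toward whichever modality has the lower variance, which matches the abstract's motivation of combining a less accurate audio estimate with a more accurate video estimate once false lip movements have been filtered out.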

Details

Language: eng - English

Dates: Published online: 2012; Published: 2012

Publication status: Published

Pages: -

Publishing info: -

Table of contents: -

Peer review: -

Identifiers (DOI, ISBN, etc.): DOI: 10.1109/MFI.2012.6343015
BibTeX citekey: Grochulla2012a
Other: Local-ID: BC1B873FD9C3D529C1257B0C00586BF0-Grochulla2012a

Degree: -

Legal Case

Project information

Source 1

Title: 2012 IEEE Conference on Multisensor Fusion and Integration for Intelligent Systems

Abbreviation: MFI 2012

Source genre: Proceedings

Creators/editors:

Affiliations:

Publisher, place: Piscataway, NJ : IEEE

Pages: -
Volume / Issue: -
Sequence number: -
Start / End page: 262 - 268
Identifiers (ISBN, ISSN, DOI, etc.): ISBN: 978-1-4673-2510-3
ISBN: 978-1-4673-2511-0
