Item Details

  Seeing with Humans: Gaze-Assisted Neural Image Captioning

Sugano, Y., & Bulling, A. (2016). Seeing with Humans: Gaze-Assisted Neural Image Captioning. Retrieved from http://arxiv.org/abs/1608.05203.

Basic Information

Item type: Report
LaTeX: Seeing with Humans: {G}aze-Assisted Neural Image Captioning

Files

arXiv:1608.05203.pdf (Preprint), 3MB
File permalink:
https://hdl.handle.net/11858/00-001M-0000-002B-AC69-D
File name:
arXiv:1608.05203.pdf
Description:
File downloaded from arXiv at 2016-10-28 11:29
OA status:
Visibility:
Public
MIME type / checksum:
application/pdf / [MD5]
Technical metadata:
Copyright date:
-
Copyright information:
-
CC license:
http://arxiv.org/help/license

Creators

Creators:
Sugano, Yusuke 1, Author
Bulling, Andreas 1, Author
Affiliations:
1 Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society, ou_1116547

Content

Keywords: Computer Science, Computer Vision and Pattern Recognition, cs.CV
Abstract: Gaze reflects how humans process visual scenes and is therefore increasingly used in computer vision systems. Previous works demonstrated the potential of gaze for object-centric tasks, such as object localization and recognition, but it remains unclear if gaze can also be beneficial for scene-centric tasks, such as image captioning. We present a new perspective on gaze-assisted image captioning by studying the interplay between human gaze and the attention mechanism of deep neural networks. Using a public large-scale gaze dataset, we first assess the relationship between state-of-the-art object and scene recognition models, bottom-up visual saliency, and human gaze. We then propose a novel split attention model for image captioning. Our model integrates human gaze information into an attention-based long short-term memory architecture, and allows the algorithm to allocate attention selectively to both fixated and non-fixated image regions. Through evaluation on the COCO/SALICON datasets we show that our method improves image captioning performance and that gaze can complement machine attention for semantic scene understanding tasks.
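
The split attention model described in the abstract mixes machine attention over fixated and non-fixated image regions. As a rough illustration only (not the authors' implementation: the function, the binary fixation mask, and the fixed scalar gate below are assumptions made for the sketch), a minimal NumPy version of such a gaze-weighted soft-attention step might look like this:

    import numpy as np

    def split_attention(features, scores, fixation_mask, gate):
        """Gaze-weighted soft attention over image regions (illustrative sketch).

        features:      (R, D) array of R region feature vectors
        scores:        (R,)   unnormalised attention scores from the decoder state
        fixation_mask: (R,)   1.0 where human fixations fall on a region, else 0.0
        gate:          scalar in [0, 1] trading off fixated vs. non-fixated regions
        """
        def masked_softmax(x, mask):
            # Softmax restricted to entries where mask is 1; zero weight elsewhere.
            e = np.exp(x - x.max()) * mask
            return e / max(e.sum(), 1e-12)

        attn_fix = masked_softmax(scores, fixation_mask)        # fixated regions
        attn_non = masked_softmax(scores, 1.0 - fixation_mask)  # non-fixated regions
        alpha = gate * attn_fix + (1.0 - gate) * attn_non       # combined attention weights
        return alpha @ features                                 # context vector for the decoder

    # Hypothetical usage: a 14x14 grid of 512-d CNN features plus a binary gaze map.
    rng = np.random.default_rng(0)
    R, D = 14 * 14, 512
    context = split_attention(rng.normal(size=(R, D)),
                              rng.normal(size=R),
                              (rng.random(R) > 0.8).astype(float),
                              gate=0.5)
    print(context.shape)  # (512,)

In the paper the split attention sits inside an attention-based LSTM captioning architecture; the fixed gate here is only a stand-in to keep the sketch self-contained.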

Details

Language: eng - English
Dates: 2016-08-18, 2016
Publication status: Published online
Pages: 8 p.
Publishing info: -
Table of contents: -
Peer review: -
Identifiers (DOI, ISBN, etc.): arXiv: 1608.05203
URI: http://arxiv.org/abs/1608.05203
BibTeX cite key: Sugano1608.05203
Degree: -
