Keywords:
Computer Science, Computer Vision and Pattern Recognition, cs.CV
Abstract:
Beyond the success in classification, neural networks have recently shown
strong results on pixel-wise prediction tasks like image semantic segmentation
on RGBD data. However, the commonly used deconvolutional layers for upsampling
intermediate representations to the full-resolution output still show different
failure modes, like imprecise segmentation boundaries and label mistakes in
particular on large, weakly textured objects (e.g. fridge, whiteboard, door).
We attribute these errors in part to the rigid way current networks aggregate
information, which can be either too local (missing context) or too global
(inaccurate boundaries). We therefore propose a data-driven pooling layer that
integrates with fully convolutional architectures and utilizes boundary
detection from RGBD image segmentation approaches. We extend our approach to
leverage region-level correspondences across images with an additional temporal
pooling stage. We evaluate our approach on the NYU-Depth-V2 dataset, which
comprises indoor RGBD video sequences, and compare it to various
state-of-the-art
baselines. Besides a general improvement over the state-of-the-art, our
approach performs particularly well on the accuracy of the predicted
boundaries and on segmenting previously problematic classes.