

Released

Conference Paper

Learnable Pooling Regions for Image Classification

MPS-Authors

Malinowski, Mateusz
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
http://pubman.mpdl.mpg.de/cone/persons/resource/persons44976

Fritz, Mario
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
http://pubman.mpdl.mpg.de/cone/persons/resource/persons44451

Locator
There are no locators available
Fulltext (public)

arXiv:1301.3516.pdf (Preprint), 424 KB

Supplementary Material (public)
There is no public supplementary material available
Citation

Malinowski, M., & Fritz, M. (2013). Learnable Pooling Regions for Image Classification. In International Conference on Learning Representations Workshop Proceedings. Retrieved from http://arxiv.org/abs/1301.3516.


Cite as: http://hdl.handle.net/11858/00-001M-0000-0018-0B03-4
Abstract
Biologically inspired, from the early HMAX model to Spatial Pyramid Matching, pooling has played an important role in visual recognition pipelines. By grouping local codes, spatial pooling equips these methods with a certain degree of robustness to translation and deformation while preserving important spatial information. Despite the predominance of this approach in current recognition systems, little progress has been made toward fully adapting the pooling strategy to the task at hand. This paper proposes a model for learning task-dependent pooling schemes that includes previously proposed hand-crafted pooling schemes as particular instantiations. We investigate the role of different regularization terms and show that the smoothness regularization term is crucial for achieving strong performance with the presented architecture. Finally, we propose an efficient and parallel method to train the model. Our experiments show improved performance over hand-crafted pooling schemes on the CIFAR-10 and CIFAR-100 datasets, in particular improving the state of the art to 56.29% on the latter.
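
To make the idea in the abstract concrete, below is a minimal NumPy sketch, not the authors' code: all names, shapes, and the exact penalty form are illustrative assumptions. It expresses pooling as a learnable linear operator over spatial positions, so hand-crafted spatial-pyramid pooling becomes the special case of fixed 0/1 masks, and it adds a smoothness penalty that ties the weights of neighbouring cells, the regularization ingredient the abstract highlights.

import numpy as np

rng = np.random.default_rng(0)

H = W = 6          # spatial grid of local codes (e.g. 6x6 positions); assumed sizes
K = 16             # codebook size (features per position)
P = 4              # number of learned pooling regions
N = H * W          # total spatial positions

# Local codes for one image: one K-dimensional code per spatial position.
codes = rng.random((N, K))

# Learnable pooling weights: each of the P regions is a soft spatial mask.
# A hand-crafted spatial pyramid corresponds to fixing these masks to 0/1 values.
weights = rng.random((P, N)) * 0.1

def pool(weights, codes):
    """Pooled representation (P, K): soft spatial masks applied to the local codes."""
    return weights @ codes

def smoothness_penalty(weights, H, W):
    """Penalize squared differences between weights of 4-connected neighbouring cells."""
    grid = weights.reshape(-1, H, W)
    dh = np.diff(grid, axis=1)   # vertical neighbour differences
    dw = np.diff(grid, axis=2)   # horizontal neighbour differences
    return (dh ** 2).sum() + (dw ** 2).sum()

pooled = pool(weights, codes)            # would feed a linear classifier downstream
reg = smoothness_penalty(weights, H, W)  # added to the training objective
print(pooled.shape, round(float(reg), 4))

In a full training loop, the classifier loss plus this penalty would be minimized jointly over the classifier and the pooling weights; the sketch only shows the pooling operator and the regularizer themselves.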