Free keywords:
Computer Science, Computer Vision and Pattern Recognition, cs.CV, Computer Science, Learning, cs.LG
Abstract:
Pooling, which is biologically inspired, has played an important role in visual
recognition pipelines, from the early HMAX model to Spatial Pyramid Matching. Spatial
pooling, by grouping local codes, equips these methods with a certain degree
of robustness to translation and deformation while preserving important spatial
information. Despite the predominance of this approach in current recognition
systems, little progress has been made toward fully adapting the pooling strategy to
the task at hand. This paper proposes a model for learning a task-dependent
pooling scheme -- one that includes previously proposed hand-crafted pooling schemes as
particular instantiations. In our work, we investigate the role of different
regularization terms, showing that the smoothness regularization term is crucial to
achieving strong performance with the presented architecture. Finally, we
propose an efficient and parallel method to train the model. Our experiments
show improved performance over hand-crafted pooling schemes on the CIFAR-10 and
CIFAR-100 datasets -- in particular, improving the state of the art to 56.29% on
the latter.
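
The idea that a learned pooling scheme can contain hand-crafted schemes as special cases, and that a smoothness term can regularize it, may be easier to see in a small sketch. The code below is illustrative only and is not the paper's implementation: it assumes pooling is a weighted sum of local codes with one weight per spatial position, and it assumes the smoothness term is the sum of squared differences between neighboring weights. The names weighted_pool, average_pool_weights, and smoothness_penalty are hypothetical.

```python
# Illustrative sketch only. The formulation (per-position weights, squared
# neighbor differences) is an assumption, not taken from the paper.

def weighted_pool(codes, weights):
    """Pool an H x W grid of local codes using one weight per position.

    codes:   H x W grid, each cell a feature vector (list of floats)
    weights: H x W grid of floats
    returns: pooled feature vector with the same length as one code
    """
    dim = len(codes[0][0])
    pooled = [0.0] * dim
    for row_codes, row_weights in zip(codes, weights):
        for code, w in zip(row_codes, row_weights):
            for k in range(dim):
                pooled[k] += w * code[k]
    return pooled


def average_pool_weights(height, width, r0, r1, c0, c1):
    """Weights that reproduce hand-crafted average pooling over the
    region with rows r0..r1-1 and columns c0..c1-1 (zero elsewhere)."""
    area = (r1 - r0) * (c1 - c0)
    return [
        [1.0 / area if r0 <= r < r1 and c0 <= c < c1 else 0.0
         for c in range(width)]
        for r in range(height)
    ]


def smoothness_penalty(weights):
    """Sum of squared differences between horizontally and vertically
    adjacent weights (an assumed form of a smoothness regularizer)."""
    h, w = len(weights), len(weights[0])
    total = 0.0
    for r in range(h):
        for c in range(w):
            if c + 1 < w:
                total += (weights[r][c + 1] - weights[r][c]) ** 2
            if r + 1 < h:
                total += (weights[r + 1][c] - weights[r][c]) ** 2
    return total


# A 2 x 2 grid of one-dimensional local codes.
codes = [[[1.0], [3.0]],
         [[5.0], [7.0]]]

# Hand-crafted average pooling over the whole grid is one weight setting.
uniform = average_pool_weights(2, 2, 0, 2, 0, 2)
pooled_uniform = weighted_pool(codes, uniform)

# Pooling over only the top-left cell is another, less smooth setting.
corner = average_pool_weights(2, 2, 0, 1, 0, 1)
pooled_corner = weighted_pool(codes, corner)
```

In this sketch, a training procedure would treat the weights as free parameters and add the penalty, scaled by a coefficient, to the task loss. Uniform weights incur zero penalty, while the single-cell weights are penalized for their abrupt edges.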