

Released

Conference Paper

Learning Smooth Pooling Regions for Visual Recognition

MPG Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons44976

Malinowski, Mateusz
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons44451

Fritz, Mario
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

External Resources
No external resources are available
Full Texts (freely accessible)

paper0118.pdf
(any full text), 251KB

Supplementary Material (freely accessible)
No freely accessible supplementary materials are available
Citation

Malinowski, M., & Fritz, M. (2013). Learning Smooth Pooling Regions for Visual Recognition. In T. Burghardt, D. Damen, W. Mayol-Cuevas, & M. Mirmehdi (Eds.), Electronic Proceedings of the British Machine Vision Conference 2013 (pp. 1-11). Durham: BMVA Press. doi:10.5244/C.27.118.


Citation link: http://hdl.handle.net/11858/00-001M-0000-0018-0C60-C
Abstract
From the early HMAX model to Spatial Pyramid Matching, spatial pooling has played an important role in visual recognition pipelines. By aggregating local statistics, it gives recognition pipelines a degree of robustness to translation and deformation while preserving spatial information. Despite its predominance in current recognition systems, there has been little progress in fully adapting the pooling strategy to the task at hand. In this paper, we propose a flexible parameterization of the spatial pooling step and learn the pooling regions together with the classifier. We investigate a smoothness regularization term that, in conjunction with an efficient learning scheme, makes learning scalable. Our framework works with both popular pooling operators: sum-pooling and max-pooling. Finally, we show the benefits of our approach for object recognition tasks based on visual words and for higher-level event recognition tasks based on object-bank features. In both cases, we improve over the hand-crafted spatial pooling step, showing the importance of adapting it to the task.
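
The idea of a parameterized pooling step can be illustrated with a minimal sketch. It is not the authors' implementation: the array shapes, the function names (`weighted_pool`, `smoothness_penalty`, `quadrant_weights`), and the squared-difference form of the smoothness term are all assumptions made for illustration, since the abstract does not specify them. The sketch treats each pooling region as a weight map over the image grid, so a hand-crafted 2x2 spatial grid becomes the special case of binary indicator weights, and a learned region would simply hold real-valued weights instead.

```python
import numpy as np


def weighted_pool(codes, weights, op="sum"):
    """Pool local codes with per-location weights (illustrative assumption).

    codes:   array of shape (H, W, K), one K-dimensional code per grid cell.
    weights: array of shape (R, H, W), one weight map per pooling region.
    Returns an array of shape (R, K).
    """
    # Broadcast to (R, H, W, K): each region re-weights every local code.
    weighted = weights[:, :, :, None] * codes[None, :, :, :]
    if op == "sum":
        return weighted.sum(axis=(1, 2))
    if op == "max":
        return weighted.max(axis=(1, 2))
    raise ValueError("op must be 'sum' or 'max'")


def smoothness_penalty(weights):
    """Sum of squared differences between neighbouring weights.

    This is one common way to encourage smooth weight maps; it is an
    assumed form, not necessarily the regularizer used in the paper.
    """
    vertical = np.diff(weights, axis=1)
    horizontal = np.diff(weights, axis=2)
    return float((vertical ** 2).sum() + (horizontal ** 2).sum())


def quadrant_weights(height, width):
    """Binary weight maps for a hand-crafted 2x2 spatial grid."""
    weights = np.zeros((4, height, width))
    h2, w2 = height // 2, width // 2
    weights[0, :h2, :w2] = 1.0
    weights[1, :h2, w2:] = 1.0
    weights[2, h2:, :w2] = 1.0
    weights[3, h2:, w2:] = 1.0
    return weights


if __name__ == "__main__":
    codes = np.ones((4, 4, 3))          # toy 4x4 grid of 3-dimensional codes
    regions = quadrant_weights(4, 4)    # hand-crafted baseline regions
    print(weighted_pool(codes, regions, "sum").shape)
    print(smoothness_penalty(regions))
```

In a learning setup along the lines the abstract describes, the `weights` array would be optimized jointly with the classifier parameters, with the penalty added to the training objective. The sketch only shows the forward computation.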