Report

View-based Models of 3D Object Recognition and Class-specific Invariances

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons84063

Logothetis, NK
Department Physiology of Cognitive Processes, Max Planck Institute for Biological Cybernetics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons84280

Vetter, T
Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Citation

Logothetis, N., Vetter, T., Hurlbert, A., & Poggio, T. (1994). View-based Models of 3D Object Recognition and Class-specific Invariances (AIM-1472).


Cite as: http://hdl.handle.net/11858/00-001M-0000-0013-ED3E-6
Abstract
This paper describes the main features of a view-based model of object recognition. The model tries to capture general properties to be expected in a biological architecture for object recognition. The basic module is a regularization network in which each of the hidden units is broadly tuned to a specific view of the object to be recognized. The network output, which may be largely view independent, is first described in terms of some simple simulations. The following refinements and details of the basic module are then discussed: (1) some of the units may represent only components of views of the object (the optimal stimulus for the unit, its "center", is effectively a complex feature); (2) the units' properties are consistent with the usual description of cortical neurons as tuned to multidimensional optimal stimuli; (3) in learning to recognize new objects, preexisting centers may be used and modified, but new centers may also be created incrementally so as to provide maximal invariance; (4) modules are part of a hierarchical structure: the output of a network may be used as one of the inputs to another, in this way synthesizing increasingly complex features and templates; (5) in several recognition tasks, in particular at the basic level, a single center using view-invariant features may be sufficient. Modules of this type can deal with recognition of specific objects, for instance a specific face under various transformations such as those due to viewpoint and illumination, provided that a sufficient number of example views of the specific object are available. An architecture for 3D object recognition, however, must cope to some extent even when only a single model view is given. The main contribution of this paper is an outline of a recognition architecture that deals with objects of a nice class undergoing a broad spectrum of transformations (due to illumination, pose, expression and so on) by exploiting prototypical examples.
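As an illustration of the basic module, the following is a minimal sketch of a network whose hidden units are each broadly tuned to one stored view. It is not the authors' implementation: the Gaussian tuning function, the ridge-regularized fit, the toy point objects, the orthographic projection, and all names (`project`, `unit_activity`, `fit_output_weights`, `respond`, `sigma`, `lam`) are assumptions made for this example.

```python
import numpy as np

def project(shape3d, angle_deg):
    """Rotate an (n_points, 3) shape about the vertical axis and
    project it orthographically to a flat 2D view vector."""
    a = np.deg2rad(angle_deg)
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    return (shape3d @ R.T)[:, :2].ravel()

def unit_activity(view, centers, sigma):
    """Activity of the hidden units; each is a Gaussian centered on one stored view."""
    d2 = np.sum((centers - view) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_output_weights(views, targets, centers, sigma, lam=1e-6):
    """Ridge-regularized least-squares fit of the output weights."""
    G = np.stack([unit_activity(v, centers, sigma) for v in views])
    A = G.T @ G + lam * np.eye(len(centers))
    return np.linalg.solve(A, G.T @ targets)

def respond(view, centers, weights, sigma):
    """Network output: weighted sum of the view-tuned unit activities."""
    return float(unit_activity(view, centers, sigma) @ weights)

# Hypothetical toy objects: six 3D points each.
target = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                   [-1, 0.5, 0], [0.5, -1, 0.5], [0, 0.5, -1]], dtype=float)
distractor = np.array([[-1, 0, 0.5], [0.5, -1, 0], [1, 1, -0.5],
                       [0, -0.5, 1], [-0.5, 1, 1], [1, -1, -1]], dtype=float)

# One center per training view of the target, sampled every 20 degrees.
train_angles = np.arange(-60, 61, 20)
centers = np.stack([project(target, a) for a in train_angles])
sigma = 0.5  # tuning width, chosen by hand for this toy data

# Train the output to be 1 on every stored view of the target.
weights = fit_output_weights(centers, np.ones(len(centers)), centers, sigma)

# Probe with a view of the target that was never stored, and with a distractor.
target_score = respond(project(target, 10.0), centers, weights, sigma)
distractor_score = respond(project(distractor, 10.0), centers, weights, sigma)
print(target_score, distractor_score)
```

In this toy setting the output stays high for an unseen view that lies between stored views and drops toward zero for the distractor, which is the sense in which the output of such a module can be largely view independent within the range covered by its centers.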
A nice class of objects is a set of objects with sufficiently similar transformation properties under specific transformations, such as viewpoint transformations. For nice object classes, we discuss two possibilities: (a) class-specific transformations are to be applied to a single model image to generate additional virtual example views, thus allowing some degree of generalization beyond what a single model view could otherwise provide; (b) class-specific, view-invariant features are learned from examples of the class and used with the novel model image, without an explicit generation of virtual examples.
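Possibility (a) can be made concrete with a small sketch. The abstract does not say how a class-specific transformation is obtained, so the example below assumes one simple setting: the novel object is a linear combination of prototype objects, and views are orthographic projections. Under that assumption, coefficients estimated from the single model view can be reused on the prototypes' rotated views to produce a virtual view. The prototype data, the 30 degree pose, and all names (`project`, `prototypes`, `virtual_view`) are hypothetical.

```python
import numpy as np

def project(shape3d, angle_deg):
    """Rotate an (n_points, 3) shape about the vertical axis and
    project it orthographically to a flat 2D view vector."""
    a = np.deg2rad(angle_deg)
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    return (shape3d @ R.T)[:, :2].ravel()

rng = np.random.default_rng(0)

# Hypothetical prototypes of the class: four objects with six 3D points each.
prototypes = rng.normal(size=(4, 6, 3))

# Assumed for this sketch: the novel object is a linear mix of the prototypes.
true_coeffs = np.array([0.5, -0.2, 0.3, 0.4])
novel = np.tensordot(true_coeffs, prototypes, axes=1)  # shape (6, 3)

# Prototype views in the model pose (0 degrees) and in a new pose (30 degrees),
# stored as columns.
proto_model_pose = np.stack([project(p, 0.0) for p in prototypes], axis=1)
proto_new_pose = np.stack([project(p, 30.0) for p in prototypes], axis=1)

# The only view available for the novel object.
model_view = project(novel, 0.0)

# Express the model view in terms of the prototype views (least squares) ...
coeffs, *_ = np.linalg.lstsq(proto_model_pose, model_view, rcond=None)

# ... and reuse the same coefficients on the prototypes' new-pose views.
virtual_view = proto_new_pose @ coeffs

# Compare with the view the novel object would really produce in the new pose.
true_view = project(novel, 30.0)
error = float(np.max(np.abs(virtual_view - true_view)))
print(error)
```

Because the projection is linear and the novel object lies in the span of the prototypes, the virtual view matches the true rotated view up to numerical precision here; for real object classes the match would only be approximate, and virtual views generated this way could then serve as additional centers for a view-based module.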