
Record


Released

Poster

The human brain as a large margin classifier

MPG Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons83943

Graf,  ABA
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons84314

Wichmann,  F
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons83839

Bülthoff,  H
Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons84193

Schölkopf,  B
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;

External Resources
No external resources are available
Full Texts (freely accessible)
No freely accessible full texts are available
Supplementary Material (freely accessible)
No freely accessible supplementary materials are available
Citation

Graf, A., Wichmann, F., Bülthoff, H., & Schölkopf, B. (2005). The human brain as a large margin classifier. Poster presented at Computational and Systems Neuroscience Meeting (COSYNE 2005), Salt Lake City, UT, USA.


Citation link: http://hdl.handle.net/11858/00-001M-0000-0013-D5FF-F
Abstract
We investigate one critical aspect of intelligence: categorization. For this we consider a visual classification experiment in which human subjects perform a gender discrimination task on realistic images of human faces. Although considerable technological advances have allowed us to gain novel insights into the visual neurosciences, our methods and understanding are still limited at the algorithmic level. Recently there has been significant progress in learning theory, and we propose to use state-of-the-art algorithms derived from machine learning theory to model the human brain. These algorithms are shown to be complex enough to capture some essentials of human behavior while remaining amenable to close analysis, thus allowing us to make quantitative predictions about human behavior based on the properties of the machine. In this way we hope to reach a better understanding of the algorithms used by animate systems for face processing and classification. In a gender classification experiment, 55 human subjects were sequentially presented with a gender-balanced random subset of 152 images drawn from a total of 200 images of laser-scanned human faces. The subjects' gender estimate was recorded for each presented stimulus. The average of the subjects' responses to each stimulus is represented by the set P, defined by the mean probability that the subjects classified a given face as male. The same classification process is then modeled algorithmically using machine learning methods. In the machine learning paradigm, the stimuli are first presented to a feature extractor which delivers a set of encoding vectors. For each stimulus, the image grayscale intensity matrix is converted into an image vector, yielding a data matrix. This matrix is then fed into the feature extractor, which in turn yields an encoding matrix containing the encoding vectors; these are not only of reduced dimensionality, but also represent all the information about the feature extraction process.
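As a minimal sketch of the encoding step described above, the following Python snippet flattens grayscale images into a data matrix and reduces its dimensionality with PCA, one of the feature extractors considered in this abstract. The random 8x8 images, the number of components and the function names are illustrative assumptions, not the authors' data or implementation.

```python
import numpy as np

def images_to_data_matrix(images):
    """Flatten each 2-D grayscale image into one row of a data matrix."""
    return np.stack([np.asarray(img, dtype=float).ravel() for img in images])

def pca_encode(data, n_components):
    """Project the centered data onto its leading principal components."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal axes, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Hypothetical stand-in for the 200 face images (random 8x8 intensity matrices).
rng = np.random.default_rng(0)
images = rng.random((200, 8, 8))

data = images_to_data_matrix(images)           # data matrix, shape (200, 64)
encoding = pca_encode(data, n_components=10)   # encoding matrix, shape (200, 10)
```

Each row of `encoding` plays the role of one encoding vector; the other feature extractors named in the abstract would replace `pca_encode` while keeping the same data matrix as input.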
To model feature extraction we first consider some benchmark models from artificial computer vision: the image size reduction (SR) and histogram (H) algorithms, and the empirical kernel map (KM: nonlinear decomposition of the Gram matrix of the data). Second, we consider linear feature extractors such as Principal Component Analysis (PCA: decomposition according to variance in the data), Independent Component Analysis (ICA: maximization of independence of the data) and Non-negative Matrix Factorization (NMF: decomposition of the data using only non-negative values). Finally, we consider Gabor wavelet filters (G), which are a benchmark model in biologically inspired computer vision. Each subject's dataset is defined by combining the encodings of the stimuli seen by that subject with the corresponding gender labels. For each of the subject datasets, classification is modeled using a separating hyperplane (SH) between the two classes, computed in various ways. The Support Vector Machine (SVM) maximizes the margin separating the two classes while minimizing the number of wrongly classified patterns and patterns in the margin stripe. The Relevance Vector Machine (RVM) classifies patterns by maximizing the conditional probability of class membership with respect to some hyperparameter. Common classifiers in the neurosciences are variants of the mean-of-class Prototype classifier (Prot), which classifies according to the closest mean-of-class prototype. This concept is extended by the K-means classifier (Kmean), which considers multiple prototypes in each class. To relate the responses of humans and machines, we compute the distance to the SH of each stimulus seen by the subject, for each subject dataset, encoding and classifier. By averaging these distances across all subjects, we obtain the set D of distances representing the responses to each stimulus of both the feature extraction and the classification algorithms.
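The distance computation above can be sketched for the simplest classifier in the list, the mean-of-class prototype classifier, whose decision boundary is the hyperplane equidistant from the two class means. The Gaussian encodings, the +1/-1 label coding and all function names below are assumptions made for illustration; only the counts (200 stimuli, 55 subjects, 152 stimuli per subject) come from the abstract. An SVM would supply a different hyperplane (w, b), but the signed-distance formula would be the same for a linear boundary.

```python
import numpy as np

def prototype_hyperplane(X, y):
    """Return (w, b) of the hyperplane equidistant from the two class means."""
    mu_pos = X[y == 1].mean(axis=0)
    mu_neg = X[y == -1].mean(axis=0)
    w = mu_pos - mu_neg
    b = -w @ (mu_pos + mu_neg) / 2.0
    return w, b

def signed_distance(X, w, b):
    """Signed Euclidean distance of each row of X to the hyperplane w.x + b = 0."""
    return (X @ w + b) / np.linalg.norm(w)

# Synthetic stand-in for the encodings (assumption): two shifted Gaussian classes.
rng = np.random.default_rng(0)
n_stimuli, n_features, n_subjects, n_seen = 200, 10, 55, 152
labels = np.repeat([1, -1], n_stimuli // 2)
encodings = rng.normal(size=(n_stimuli, n_features)) + 0.5 * labels[:, None]

dist_sum = np.zeros(n_stimuli)
counts = np.zeros(n_stimuli)
for _ in range(n_subjects):
    # Each simulated subject sees a gender-balanced random subset of the stimuli.
    seen = np.concatenate([
        rng.choice(np.flatnonzero(labels == 1), n_seen // 2, replace=False),
        rng.choice(np.flatnonzero(labels == -1), n_seen // 2, replace=False),
    ])
    w, b = prototype_hyperplane(encodings[seen], labels[seen])
    dist_sum[seen] += signed_distance(encodings[seen], w, b)
    counts[seen] += 1

# Average distance per stimulus over the subjects who saw it (the set D).
D = dist_sum / np.maximum(counts, 1)
```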
The set D is the machine learning counterpart of the subjects' responses P to each stimulus. The analysis of the function P(D) then provides insight into the algorithms which best model the gender classification of faces by humans. In particular, we compute the "averaged" psychometric function relating D to P and assess the goodness-of-fit of this interpolation. We then see that the H and the SR feature extractors are least human-like. The G is best in this respect, which allows us to re-obtain a benchmark result from psychophysics and neurophysiology. The linear feature extractors represent an intermediate case: the part-based NMF, where the basis vectors have local features highlighted, compares better to humans than the holistic PCA and ICA, which have basis vectors representing the complete contours of the faces. Among the classifiers, the SVM yields a decision metric closest to that of humans for all feature extractors, while the prototype classifier is worst for all feature extractors considered. In summary, the superiority of the Gabor filter model for extraction of visual features comes as no surprise, since Gabor filters have been shown to be biologically plausible in both psychophysical and neurophysiological studies. However, a novel result is that the SVM, and not the popular prototype classifier, best explains the human data at the classification stage. In our research we have studied how well different machines globally mimic the subjects' classification behavior. We find that some machines capture more than just the input-output (classification error) mapping of the human subjects: they also capture some aspects of the human internal representation of faces. The distance of a face to the SH reflects the classification difficulty for the G feature extractor and SVM classifier exceedingly well and thus allows us to relate the responses of humans and machines.
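One way to relate D to P is sketched below: a logistic curve is fitted to P as a function of D, and the root-mean-square error is used as a crude goodness-of-fit measure. The abstract does not state the functional form or the fit statistic the authors used, so the logistic shape, the logit-regression fitting method, the synthetic data and all names here are assumptions.

```python
import numpy as np

def logistic(D, slope, offset):
    """Logistic psychometric curve: probability of a 'male' response at distance D."""
    return 1.0 / (1.0 + np.exp(-(slope * D + offset)))

def fit_logistic(D, P, eps=1e-3):
    """Fit slope and offset by linear regression on the logit of P."""
    P_clipped = np.clip(P, eps, 1.0 - eps)   # avoid log(0) at P = 0 or P = 1
    logit = np.log(P_clipped / (1.0 - P_clipped))
    slope, offset = np.polyfit(D, logit, deg=1)
    return slope, offset

# Synthetic stand-in for the measured sets D and P (assumption).
rng = np.random.default_rng(0)
D = rng.uniform(-3.0, 3.0, size=200)
P_true = logistic(D, 1.5, 0.2)
P = np.clip(P_true + rng.normal(scale=0.03, size=D.size), 0.0, 1.0)

slope, offset = fit_logistic(D, P)
rmse = np.sqrt(np.mean((logistic(D, slope, offset) - P) ** 2))
print(f"slope={slope:.2f}, offset={offset:.2f}, RMSE={rmse:.3f}")
```

Under this sketch, a feature extractor and classifier pair whose distances D predict P with a lower residual error would count as more human-like.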
Machine learning can thus be used successfully to understand human classification behavior quantitatively and to bridge the gap between human psychophysics and theoretical modeling. It is a novel and efficient algorithmic tool which is theoretically well-founded. We hope to have demonstrated the usefulness of our approach, and would like to see it applied to other fields of neuroscience such as electrophysiology or imaging.