Abstract:
In which way do the local image statistics at the center of gaze differ from those at randomly chosen image
locations? In 1999, Reinagel and Zador [1] showed that RMS contrast is significantly increased around
fixated locations in natural images. Since then, numerous additional hypotheses have been proposed, based
on edge content, entropy, self-information, higher-order statistics, or sophisticated models such as that of
Itti and Koch [2].
While these models differ considerably in the image features they use, they hardly differ in
predictive power. This complicates the question of which bottom-up mechanism actually drives
human eye movements. To shed some light on this problem, we analyze the nonlinear receptive fields of
an eye movement model which is purely data-driven. It consists of a nonparametric radial basis function
network, fitted to human eye movement data. To avoid a bias towards specific image features such as
edges or corners, we deliberately chose raw pixel values as the input to our model, not the outputs of
some filter bank. The learned model is analyzed by computing its optimal stimuli. It turns out that there
are two maximally excitatory stimuli, both of which have center-surround structure, and two maximally
inhibitory stimuli which are basically flat. We argue that these can be seen as nonlinear receptive fields of
the underlying system. In particular, we show that a small radial basis function network with the optimal
stimuli as centers predicts unseen eye movements as precisely as the full model.
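
The optimal-stimulus analysis described above can be illustrated with a minimal sketch. It assumes a Gaussian radial basis function network and searches for the input patch of fixed norm that maximizes (or minimizes) the network output by projected gradient ascent. The kernel form, the norm constraint, the optimizer, and all names (`rbf_output`, `rbf_gradient`, `optimal_stimulus`, `sigma`, `step`) are assumptions made for illustration; the abstract does not specify them, and the toy centers and weights below are not fitted to any eye movement data.

```python
import numpy as np

def rbf_output(x, centers, weights, sigma):
    """Gaussian RBF network: f(x) = sum_i w_i * exp(-||x - c_i||^2 / (2 * sigma^2))."""
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    return float(weights @ np.exp(-sq_dist / (2.0 * sigma ** 2)))

def rbf_gradient(x, centers, weights, sigma):
    """Gradient of rbf_output with respect to the input patch x."""
    diff = centers - x  # shape (n_centers, n_pixels)
    k = np.exp(-np.sum(diff ** 2, axis=1) / (2.0 * sigma ** 2))
    return (weights * k) @ diff / sigma ** 2

def optimal_stimulus(x0, centers, weights, sigma, sign=1.0, step=0.1, n_iter=500):
    """Projected gradient ascent (sign=+1) or descent (sign=-1) on the sphere ||x|| = ||x0||.

    The fixed-norm constraint is an assumption of this sketch.
    """
    radius = np.linalg.norm(x0)
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        x = x + sign * step * rbf_gradient(x, centers, weights, sigma)
        x *= radius / np.linalg.norm(x)  # project back onto the sphere
    return x

# Toy network (hypothetical): one excitatory and one inhibitory center
# on opposite sides of the unit sphere in a 4-pixel "patch" space.
e1 = np.array([1.0, 0.0, 0.0, 0.0])
centers = np.stack([e1, -e1])
weights = np.array([1.0, -1.0])
x0 = np.full(4, 0.5)  # unit-norm starting patch

x_exc = optimal_stimulus(x0, centers, weights, sigma=1.0, sign=+1.0)
x_inh = optimal_stimulus(x0, centers, weights, sigma=1.0, sign=-1.0)
```

In this toy setting the maximally excitatory stimulus converges to the positively weighted center and the maximally inhibitory stimulus to the negatively weighted one. For a network fitted to real data, the optima need not coincide with any single center, and several restarts from different initial patches would be needed to find distinct local optima.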
The fact that center-surround filters emerge from a simple optimality criterion, without any prior assumption
that would make them more probable than, e.g., edges, corners, or any other configuration of pixel
values in a square patch, suggests a special role of these filters in free-viewing of natural images.