Learning the Influence of Spatio-Temporal Variations in Local Image Structure 
on Visual Saliency

Kienzle, W; Wichmann, FA; Schölkopf, B; Franz, MO

Item

ITEM ACTIONSEXPORT

Add to Basket

Local TagsRelease HistoryDetailsSummary

Released

Poster

Learning the Influence of Spatio-Temporal Variations in Local Image Structure on Visual Saliency

MPS-Authors

/persons/resource/persons84012

Kienzle, W
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

/persons/resource/persons84314

Wichmann, FA
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

/persons/resource/persons84193

Schölkopf, B
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

/persons/resource/persons83919

Franz, MO
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

External Resource

http://www.cyberneum.de/fileadmin/user_upload/files/publications/TWK-2007.pdf
(Publisher version)

Fulltext (restricted access)

There are currently no full texts shared for your IP range.

Fulltext (public)

There are no public fulltexts stored in PuRe

Supplementary Material (public)

There is no public supplementary material available

Citation

Kienzle, W., Wichmann, F., Schölkopf, B., & Franz, M. (2007). Learning the Influence of Spatio-Temporal Variations in Local Image Structure on Visual Saliency. Poster presented at 10th Tübinger Wahrnehmungskonferenz (TWK 2007), Tübingen, Germany.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-CCE9-A

Abstract

Computational models for bottom-up visual attention traditionally consist of a bank of Gabor-like or Difference-of-Gaussians filters and a nonlinear combination scheme which combines the filter responses into a real-valued saliency measure [1]. Recently it was shown that a standard machine learning algorithm can be used to derive a saliency model from human eye movement data with a very small number of additional assumptions. The learned model is much simpler than previous models, but nevertheless has state-of-the-art prediction performance [2]. A central result from this study is that DoG-like center-surround filters emerge as the unique solution to optimizing the predictivity of the model.
Here we extend the learning method to the temporal domain. While the previous model [2] predicts visual saliency based on local pixel intensities in a static image, our model also takes into account temporal intensity variations. We find that the learned model responds strongly to temporal intensity changes ocurring 200-250ms before a saccade is initiated. This delay coincides with the typical saccadic latencies, indicating that the learning algorithm has extracted a meaningful statistic from the training data. In addition, we show that the model correctly predicts a significant proportion of human eye movements on previously unseen test data.