
Released

Poster

Improving automatic pedestrian detection by means of human perception

MPS-Authors

Vuong,  QC
Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Citation

Vuong, Q., & Castrillón-Santana, M. (2005). Improving automatic pedestrian detection by means of human perception. Poster presented at 8th Tübinger Wahrnehmungskonferenz (TWK 2005), Tübingen, Germany.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-D645-7
Abstract
The present study investigates the detection of pedestrians by humans and by computer vision systems. Human observers accomplish this simple task easily and quickly, but it still poses a challenge for current systems. In the computer vision community, various automatic detection systems have been designed around simple features that detect faces, heads and shoulders, full bodies, or leg regions. Using these regions, such systems perform fairly well but still have high miss rates. However, we found only a small correlation between the performance of these systems and that of human observers. This finding motivated us to systematically analyze human performance on a pedestrian detection task, testing whether these regions are the most semantically useful and whether other regions also provide useful information. For that purpose, we used the psychophysical “bubbles” technique[1] to isolate the regions that humans use for detection. In this technique, images containing aligned pedestrian and non-pedestrian scenes are revealed through a mask of small, randomly distributed Gaussian windows (“bubbles”). Across observers, the masks leading to correct “present” responses are summed and normalized to reveal the image regions that were useful for detection. Our results indicate that observers relied predominantly on head and leg regions and, to a lesser extent, on arm regions. These results confirm some of the regions already considered by automatic pedestrian detectors. An important question is whether, among the regions that are particularly discriminable, some are more critical than others for certain pattern-matching problems. To address this question, the regions selected by human observers were applied to the general object detection framework designed by Viola and Jones[2]. This framework has already been successfully applied to different object categories and is well known in the face detection community because it provides real-time performance. In sum, we believe that this perceptually based approach can be useful for assigning different weights to regions and points based not only on their discriminability but also on their perceptual significance for the problem considered.
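
The bubbles procedure described in the abstract can be made concrete with a short sketch. The Python code below is only an illustrative reconstruction, not the authors' implementation: the number of bubbles, the Gaussian width, the gray background value, and the normalization of correct-trial masks by total bubble coverage are all assumptions chosen for clarity.

```python
import numpy as np

def bubbles_mask(shape, n_bubbles=15, sigma=8.0, rng=None):
    """Build a mask of randomly placed Gaussian windows ("bubbles")."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    mask = np.zeros(shape, dtype=float)
    for _ in range(n_bubbles):
        cy, cx = rng.integers(0, h), rng.integers(0, w)
        mask += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * sigma ** 2))
    return np.clip(mask, 0.0, 1.0)

def reveal(image, mask, background=0.5):
    """Show the image only through the bubbles; elsewhere show a neutral gray."""
    return mask * image + (1.0 - mask) * background

def classification_image(masks, correct):
    """Sum masks from correct 'present' trials and normalize by overall bubble coverage."""
    masks = np.asarray(masks, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    used = masks[correct].sum(axis=0)
    total = masks.sum(axis=0) + 1e-9  # avoid division by zero where no bubble ever landed
    return used / total

# Hypothetical usage on grayscale stimuli with values in [0, 1]:
# m = bubbles_mask(img.shape)
# stimulus = reveal(img, m)          # shown to the observer on one trial
# ... collect (mask, response_correct) pairs across trials and observers ...
# ci = classification_image(all_masks, all_correct)
# High values in `ci` mark regions diagnostic for pedestrian detection,
# e.g. the head and leg regions reported in the poster.
```

Under this reading, the resulting classification image could serve as a per-region weighting when selecting or prioritizing features in a Viola–Jones-style detector, which is how we interpret the weighting idea mentioned at the end of the abstract.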