de.mpg.escidoc.pubman.appbase.FacesBean
English
 
Help Guide Disclaimer Contact us Login
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT

Released

Paper

Articulated Multi-person Tracking in the Wild

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons188120

Insafutdinov,  Eldar
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons71836

Andriluka,  Mykhaylo
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45196

Pishchulin,  Leonid
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons86684

Tang,  Siyu
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons44919

Levinkov,  Evgeny
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons98382

Andres,  Bjoern
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45383

Schiele,  Bernt
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

Locator
There are no locators available
Fulltext (public)

arXiv:1612.01465.pdf
(Preprint), 14MB

Supplementary Material (public)
There is no public supplementary material available
Citation

Insafutdinov, E., Andriluka, M., Pishchulin, L., Tang, S., Levinkov, E., Andres, B., et al. (2016). Articulated Multi-person Tracking in the Wild. Retrieved from http://arxiv.org/abs/1612.01465.


Cite as: http://hdl.handle.net/11858/00-001M-0000-002C-1867-1
Abstract
In this paper we propose an approach for articulated tracking of multiple people in unconstrained videos. Our starting point is a model that resembles existing architectures for single-frame pose estimation but is several orders of magnitude faster. We achieve this in two ways: (1) by simplifying and sparsifying the body-part relationship graph and leveraging recent methods for faster inference, and (2) by offloading a substantial share of computation onto a feed-forward convolutional architecture that is able to detect and associate body joints of the same person even in clutter. We use this model to generate proposals for body joint locations and formulate articulated tracking as spatio-temporal grouping of such proposals. This allows to jointly solve the association problem for all people in the scene by propagating evidence from strong detections through time and enforcing constraints that each proposal can be assigned to one person only. We report results on a public MPII Human Pose benchmark and on a new dataset of videos with multiple people. We demonstrate that our model achieves state-of-the-art results while using only a fraction of time and is able to leverage temporal information to improve state-of-the-art for crowded scenes.