Free keywords:
Computer Science, Computer Vision and Pattern Recognition, cs.CV
Abstract:
In this paper we propose an approach for articulated tracking of multiple
people in unconstrained videos. Our starting point is a model that resembles
existing architectures for single-frame pose estimation but is several orders
of magnitude faster. We achieve this in two ways: (1) by simplifying and
sparsifying the body-part relationship graph and leveraging recent methods for
faster inference, and (2) by offloading a substantial share of computation onto
a feed-forward convolutional architecture that is able to detect and associate
body joints of the same person even in clutter. We use this model to generate
proposals for body joint locations and formulate articulated tracking as
spatio-temporal grouping of such proposals. This allows us to jointly solve the
association problem for all people in the scene by propagating evidence from
strong detections through time and enforcing the constraint that each proposal
can be assigned to at most one person. We report results on the public MPII
Human Pose benchmark and on a new dataset of videos with multiple people. We
demonstrate that our model achieves state-of-the-art results while using only a
fraction of the compute time, and that it is able to leverage temporal
information to improve over the state of the art in crowded scenes.
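The grouping idea described above can be illustrated with a toy sketch: joint proposals are seeded into tracks starting from the strongest detections and propagated forward through time, with each proposal assigned to at most one track. This is a greedy nearest-neighbor simplification for illustration only, not the paper's actual formulation; the class and function names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    # A hypothetical joint-location proposal: frame index, 2-D position, detector score.
    frame: int
    pos: tuple
    score: float

def dist(a, b):
    # Euclidean distance between two 2-D points.
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def group_proposals(proposals, max_dist=20.0):
    """Toy spatio-temporal grouping (assumed interface): seed tracks at strong
    detections, then propagate each track frame by frame, attaching the nearest
    unused proposal.  Each proposal joins at most one track."""
    by_frame = {}
    for i, p in enumerate(proposals):
        by_frame.setdefault(p.frame, []).append(i)
    used = set()
    tracks = []
    # Seed tracks in descending score order so strong detections anchor the grouping.
    for i in sorted(range(len(proposals)), key=lambda i: -proposals[i].score):
        if i in used:
            continue
        track = [i]
        used.add(i)
        f, cur = proposals[i].frame, proposals[i].pos
        while True:
            f += 1
            candidates = [j for j in by_frame.get(f, []) if j not in used]
            if not candidates:
                break
            j = min(candidates, key=lambda j: dist(proposals[j].pos, cur))
            if dist(proposals[j].pos, cur) > max_dist:
                break  # no plausible continuation in this frame
            track.append(j)
            used.add(j)
            cur = proposals[j].pos
        tracks.append(track)
    return tracks
```

On two well-separated people across two frames, the sketch recovers one track per person; the real model additionally reasons over body-part relationships within each frame.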