Mo2Cap2: Real-time Mobile 3D Motion Capture with a Cap-mounted Fisheye Camera

Xu, Weipeng; Chatterjee, Avishek; Zollhöfer, Michael; Rhodin, Helge; Fua, Pascal; Seidel, Hans-Peter; Theobalt, Christian

Please note that a newer version of this item is available:
https://pure.mpg.de/pubman/item/item_2583634_3

Released

Paper

MPS-Authors

Xu, Weipeng
Computer Graphics, MPI for Informatics, Max Planck Society

Chatterjee, Avishek
Computer Graphics, MPI for Informatics, Max Planck Society

Zollhöfer, Michael
Computer Graphics, MPI for Informatics, Max Planck Society

Seidel, Hans-Peter
Computer Graphics, MPI for Informatics, Max Planck Society

Theobalt, Christian
Computer Graphics, MPI for Informatics, Max Planck Society

Fulltext (public)

arXiv:1803.05959.pdf (Preprint), 7 MB

Citation

Xu, W., Chatterjee, A., Zollhöfer, M., Rhodin, H., Fua, P., Seidel, H.-P., et al. (2018). Mo2Cap2: Real-time Mobile 3D Motion Capture with a Cap-mounted Fisheye Camera. Retrieved from http://arxiv.org/abs/1803.05959.

Cite as: https://hdl.handle.net/21.11116/0000-0001-3C65-B

Abstract

We propose the first real-time approach for the egocentric estimation of 3D human body pose in a wide range of unconstrained everyday activities. This setting has a unique set of challenges, such as mobility of the hardware setup and robustness to long capture sessions with fast recovery from tracking failures. We tackle these challenges with a novel lightweight setup that converts a standard baseball cap into a device for high-quality pose estimation based on a single cap-mounted fisheye camera. From the captured egocentric live stream, our CNN-based 3D pose estimation approach runs at 60 Hz on a consumer-level GPU. In addition to the novel hardware setup, our other main contributions are: 1) a large ground-truth training corpus of top-down fisheye images and 2) a novel disentangled 3D pose estimation approach that takes the unique properties of the egocentric viewpoint into account. As shown by our evaluation, we achieve lower 3D joint error as well as better 2D overlay than the existing baselines.
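
The abstract reports a lower "3D joint error" without defining it on this page. As a minimal sketch, assuming the metric is the commonly used mean Euclidean distance between corresponding predicted and ground-truth joints, it could be computed as follows. The function name `mean_joint_error` and the three-joint pose values are hypothetical and are not taken from the paper.

```python
import math


def mean_joint_error(predicted, ground_truth):
    """Mean Euclidean distance between corresponding 3D joints.

    Assumption: this is one common definition of a 3D joint error;
    the record above does not state which definition the paper uses.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("pose lists must be non-empty and of equal length")
    total = 0.0
    for p, g in zip(predicted, ground_truth):
        total += math.dist(p, g)  # distance for one joint
    return total / len(predicted)


# Hypothetical three-joint pose, coordinates in millimetres.
ground_truth = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (103.0, 4.0, 0.0), (0.0, 200.0, 12.0)]

error_mm = mean_joint_error(predicted, ground_truth)
print(f"mean joint error: {error_mm:.2f} mm")
```

In this made-up example the per-joint distances are 0, 5 and 12 mm, so the mean is 17/3 mm. Averaging this value over all frames of a sequence would give a single score per method for comparison against baselines.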