Hand Shape Recognition Using a ToF Camera : An Application to Sign Language

Simonovsky, Martin

Released

Thesis

MPG Authors

Simonovsky, Martin
International Max Planck Research School, MPI for Informatics, Max Planck Society;
Computer Graphics, MPI for Informatics, Max Planck Society;

Citation

Simonovsky, M. (2011). Hand Shape Recognition Using a ToF Camera: An Application to Sign Language. Master Thesis, Universität des Saarlandes, Saarbrücken.

Cite link: https://hdl.handle.net/11858/00-001M-0000-0010-11B9-2

Abstract

This master's thesis investigates the benefit of using depth information acquired by a time-of-flight (ToF) camera for hand shape recognition from unrestricted viewpoints. Specifically, we assess the hypothesis that classical 3D content descriptors might be inappropriate for ToF depth images because of the 2.5D nature and noisiness of the data and the potentially expensive computations in 3D space. Instead, we extend 2D descriptors to make use of the additional semantics of depth images. Our system follows the appearance-based retrieval paradigm and uses a synthetic 3D hand model to generate its database. The system runs at interactive frame rates. For increased robustness, no color, intensity, or time coherence information is used. We introduce a novel, domain-specific algorithm for segmenting the forearm from the upper body, based on reprojecting the acquired geometry into the lateral view. Moreover, three kinds of descriptors exploiting depth data are proposed, and the design choices made are supported experimentally. The whole system is then evaluated on an American Sign Language fingerspelling dataset. However, the retrieval performance still leaves room for improvement. Several insights and possible reasons are discussed.
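
To make the appearance-based retrieval paradigm mentioned in the abstract more concrete, the following is a minimal sketch, not the thesis's implementation. It assumes a toy descriptor (mean depth per grid cell, shifted so the nearest cell is zero) and a plain nearest-neighbor lookup; the names `depth_descriptor` and `retrieve`, the grid size, and the tiny example images are all hypothetical and chosen only for illustration. In the thesis the database is generated from a synthetic 3D hand model and the descriptors are different.

```python
import math


def depth_descriptor(depth, grid=2):
    """Toy descriptor (an assumption, not from the thesis): mean depth per
    cell of a grid x grid partition, shifted so the closest cell is 0."""
    rows, cols = len(depth), len(depth[0])
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            r0, r1 = gy * rows // grid, (gy + 1) * rows // grid
            c0, c1 = gx * cols // grid, (gx + 1) * cols // grid
            vals = [depth[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(vals) / len(vals))
    base = min(cells)
    # Subtracting the minimum removes the absolute distance to the camera.
    return [v - base for v in cells]


def retrieve(query_depth, database):
    """Return (label, distance) of the database entry whose descriptor is
    closest to the query descriptor in Euclidean distance."""
    q = depth_descriptor(query_depth)
    best_label, best_dist = None, math.inf
    for label, desc in database:
        d = math.dist(q, desc)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


# Hypothetical "rendered" database entries: a flat surface and a stepped one.
flat = [[1.0] * 4 for _ in range(4)]
stepped = [[1.0, 1.0, 2.0, 2.0] for _ in range(4)]
database = [
    ("flat", depth_descriptor(flat)),
    ("stepped", depth_descriptor(stepped)),
]

# A noisy query observed farther from the camera than the database entries.
query = [[3.0, 3.0, 4.1, 3.9] for _ in range(4)]
label, dist = retrieve(query, database)
print(label)
```

The query matches the stepped entry even though it lies at a different absolute depth, because the descriptor only encodes depth relative to the nearest cell. A real system would need descriptors that also cope with viewpoint changes and sensor noise, which is the question the thesis studies.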