
Released

Conference Paper

Learning Complex Motions by Sequencing Simpler Motion Templates

MPG Authors

Peters, J.
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society;

External Resources
Full texts (restricted access)
No full texts are currently released for your IP range.
Full texts (open access)
No openly accessible full texts are available in PuRe.
Supplementary material (open access)
No openly accessible supplementary materials are available.
Citation

Neumann, G., Maass, W., & Peters, J. (2009). Learning Complex Motions by Sequencing Simpler Motion Templates. In A. Danyluk, L. Bottou, & M. Littman (Eds.), ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning (pp. 753-760). New York, NY, USA: ACM Press.


Citation link: https://hdl.handle.net/11858/00-001M-0000-0013-C487-8
Abstract
Abstraction of complex, longer motor tasks into simpler elemental movements enables humans and animals to exhibit motor skills which have not yet been matched by robots. Humans intuitively decompose complex motions into smaller, simpler segments. For example, when describing a simple movement such as drawing a triangle with a pen, we can easily name the basic steps of this movement. Surprisingly, such abstractions have rarely been used in artificial motor-skill learning algorithms. These algorithms typically choose a new action (such as a torque or a force) at a very fast time-scale. As a result, both the policy and the temporal credit assignment problem become unnecessarily complex, often beyond the reach of current machine learning methods. We introduce a new framework for temporal abstraction in reinforcement learning (RL), i.e., RL with motion templates. We present a new algorithm for this framework which can learn high-quality policies by making only a few abstract decisions.
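The core idea of the abstract, choosing one parameterized motion template instead of one low-level action per control step, can be illustrated with a minimal sketch. This is not the paper's algorithm; the template shape (a minimum-jerk-like ramp) and all durations are hypothetical, chosen only to show how one abstract decision expands into many low-level control steps.

```python
import math

def flat_policy_decisions(duration_s, control_dt=0.001):
    """Low-level control: one action (e.g. a torque) per control step."""
    return round(duration_s / control_dt)

def template_policy_decisions(duration_s, template_duration_s=0.5):
    """Template-level control: one abstract decision per template."""
    return math.ceil(duration_s / template_duration_s)

class MotionTemplate:
    """Hypothetical parameterized movement segment: a minimum-jerk-like
    ramp from a start position to a goal position over a fixed duration."""
    def __init__(self, goal, duration_s):
        self.goal = goal
        self.duration_s = duration_s

    def rollout(self, start, dt=0.001):
        """Expand the single abstract decision into a low-level trajectory."""
        n = round(self.duration_s / dt)
        for k in range(1, n + 1):
            s = k / n                            # normalized time in [0, 1]
            shape = 10*s**3 - 15*s**4 + 6*s**5   # minimum-jerk profile
            yield start + (self.goal - start) * shape

# One abstract decision ("reach position 1.0 over 0.5 s") produces
# 500 low-level control steps at a 1 ms control rate.
tpl = MotionTemplate(goal=1.0, duration_s=0.5)
traj = list(tpl.rollout(start=0.0))

# For a 2 s task, a flat policy makes 2000 decisions, while a
# template-level policy makes only 4 abstract decisions.
print(len(traj), flat_policy_decisions(2.0), template_policy_decisions(2.0))
```

The contrast in decision counts is what shortens the temporal credit assignment horizon: reward must be attributed across a handful of template choices rather than thousands of individual torques.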