Reinforcement learning to adjust parametrized motor primitives to new situations

Kober, J; Wilhelm, A; Oztop, E; Peters, J

doi:10.1007/s10514-012-9290-3

Datensatz

DATENSATZ AKTIONENEXPORT

Zur Ablage hinzufügen

Lokale TagsFreigabegeschichteDetailsÜbersicht

Freigegeben

Zeitschriftenartikel

Reinforcement learning to adjust parametrized motor primitives to new situations

MPG-Autoren

/persons/resource/persons84021

Kober, J
Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society;

/persons/resource/persons84135

Peters, J
Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society;

Externe Ressourcen

https://link.springer.com/content/pdf/10.1007%2Fs10514-012-9290-3.pdf
(Verlagsversion)

Volltexte (beschränkter Zugriff)

Für Ihren IP-Bereich sind aktuell keine Volltexte freigegeben.

Volltexte (frei zugänglich)

Es sind keine frei zugänglichen Volltexte in PuRe verfügbar

Ergänzendes Material (frei zugänglich)

Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar

Zitation

Kober, J., Wilhelm, A., Oztop, E., & Peters, J. (2012). Reinforcement learning to adjust parametrized motor primitives to new situations. Autonomous Robots, 33(4), 361-379. doi:10.1007/s10514-012-9290-3.

Zitierlink: https://hdl.handle.net/11858/00-001M-0000-0013-B568-7

Zusammenfassung

Humans manage to adapt learned movements very quickly to new situations by generalizing learned behaviors from similar situations. In contrast, robots currently often need to re-learn the complete movement. In this paper, we propose a method that learns to generalize parametrized motor plans by adapting a small set of global parameters, called meta-parameters. We employ reinforcement learning to learn the required meta-parameters to deal with the current situation, described by states. We introduce an appropriate reinforcement learning algorithm based on a kernelized version of the reward-weighted regression. To show its feasibility, we evaluate this algorithm on a toy example and compare it to several previous approaches. Subsequently, we apply the approach to three robot tasks, i.e., the generalization of throwing movements in darts, of hitting movements in table tennis, and of throwing balls where the tasks are learned on several different real physical robots, i.e., a Barrett WAM, a BioRob, the JST-ICORP/SARCOS CBi and a Kuka KR 6.