Reinforcement Learning for Operational Space Control

Peters, J

doi:10.1109/ROBOT.2007.363633

Peters, J. (2007). Reinforcement Learning for Operational Space Control. In 2007 IEEE International Conference on Robotics and Automation (ICRA 2007) (pp. 2111-2116). Los Alamitos, CA, USA: IEEE Computer Society.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/11858/00-001M-0000-0013-CE27-C
Version Permalink: https://hdl.handle.net/11858/00-001M-0000-0013-CE28-A

Genre: Conference Paper

Creators

Creators:
Peters, J^{1, 2}, Author

Affiliations:
1 Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497795
2 Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society, ou_1497647

Content

Free keywords: -

Abstract: While operational space control is of essential importance for robotics and well understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting supervised learning problem is ill-defined, as it requires learning an inverse mapping of a usually redundant system, which is well known to suffer from non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. The important insight that many operational space control algorithms can be reformulated as optimal control problems, however, allows addressing this inverse learning problem in the framework of reinforcement learning. Still, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which is infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.
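
The reduction to reward-weighted regression mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a one-parameter linear policy `u = gain * x + noise`, a fixed exponential reward transformation with a hand-picked `temperature` (the paper adapts the transformation, which is not reproduced here), and a toy quadratic reward around a hypothetical `target_gain`. All function and variable names are illustrative.

```python
import math
import random


def reward_weights(rewards, temperature):
    """Exponential reward transformation (assumed form, fixed temperature).

    Subtracting the best reward avoids overflow and does not change the
    regression solution, since it rescales every weight by the same factor.
    """
    best = max(rewards)
    return [math.exp(temperature * (r - best)) for r in rewards]


def rwr_update(states, actions, rewards, temperature=1.0):
    """One reward-weighted regression step for the policy u = gain * x + noise."""
    w = reward_weights(rewards, temperature)
    # Weighted least squares for a single gain parameter.
    num = sum(wi * x * u for wi, x, u in zip(w, states, actions))
    den = sum(wi * x * x for wi, x in zip(w, states))
    gain = num / den
    # The reward-weighted residual variance becomes the new exploration noise.
    var = sum(wi * (u - gain * x) ** 2
              for wi, x, u in zip(w, states, actions)) / sum(w)
    return gain, var


def run_demo(iterations=30, samples=500, seed=0):
    """Toy problem: rewards are highest when the action matches target_gain * x."""
    rng = random.Random(seed)
    target_gain = 1.5  # hypothetical ideal gain, used only to define the toy reward
    gain, var = 0.0, 1.0
    for _ in range(iterations):
        states = [rng.gauss(0.0, 1.0) for _ in range(samples)]
        # Explore by sampling actions from the current stochastic policy.
        actions = [gain * x + rng.gauss(0.0, math.sqrt(var)) for x in states]
        rewards = [-(u - target_gain * x) ** 2 for x, u in zip(states, actions)]
        gain, var = rwr_update(states, actions, rewards)
    return gain, target_gain


if __name__ == "__main__":
    learned, target = run_demo()
    print(f"learned gain {learned:.3f}, target gain {target:.3f}")
```

In this sketch each update only re-weights actions the current policy actually tried, so the gain moves gradually toward better-rewarded behaviour rather than jumping, which is the kind of smooth learning the abstract describes.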

Details

Language(s):

Dates: Date issued: 2007-04

Publication Status: Issued

Pages: -

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: URI: http://www.icra07.org/
DOI: 10.1109/ROBOT.2007.363633
BibTex Citekey: 4723

Degree: -

Event

Title: 2007 IEEE International Conference on Robotics and Automation (ICRA 2007)

Place of Event: Roma, Italy

Start-/End Date: -

Source 1

Title: 2007 IEEE International Conference on Robotics and Automation (ICRA 2007)

Source Genre: Proceedings

Creator(s):

Affiliations:

Publ. Info: Los Alamitos, CA, USA : IEEE Computer Society

Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 2111 - 2116
Identifier: -