Hierarchical Relative Entropy Policy Search

Daniel, C; Neumann, G; Peters, J

Conference Paper

MPS-Authors

Peters, J
Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society;

External Resource

http://proceedings.mlr.press/v22/daniel12/daniel12.pdf
(Publisher version)

Citation

Daniel, C., Neumann, G., & Peters, J. (2012). Hierarchical Relative Entropy Policy Search. In N. Lawrence, & M. Girolami (Eds.), Artificial Intelligence and Statistics, 21-23 April 2012, La Palma, Canary Islands (pp. 273-281). Madison, WI, USA: International Machine Learning Society.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-B7E8-8

Abstract

Many real-world problems are inherently hierarchically structured. The use of this structure in an agent's policy may well be the key to improved scalability and higher performance. However, such hierarchical structures cannot be exploited by current policy search algorithms. We will concentrate on a basic, but highly relevant hierarchy - the 'mixed option' policy. Here, a gating network first decides which of the options to execute and, subsequently, the option-policy determines the action. In this paper, we reformulate learning a hierarchical policy as a latent variable estimation problem and subsequently extend the Relative Entropy Policy Search (REPS) to the latent variable case. We show that our Hierarchical REPS can learn versatile solutions while also showing an increased performance in terms of learning speed and quality of the found policy in comparison to the non-hierarchical approach.
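
The two-stage sampling described in the abstract (a gating network picks an option, then that option's policy picks the action) can be illustrated with a minimal sketch. This is not the authors' implementation and does not include the REPS update. It assumes a softmax gating network that is linear in the state and option policies that are linear-Gaussian; the names `gating_probs`, `sample_action`, `gate_weights` and `option_params` are hypothetical and chosen only for this illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gating_probs(state, gate_weights):
    """Probability of each option given the state.

    Assumption: one weight vector per option, scores linear in the state.
    """
    scores = [sum(w * s for w, s in zip(ws, state)) for ws in gate_weights]
    return softmax(scores)


def sample_action(state, gate_weights, option_params, rng):
    """Sample an option from the gating network, then an action from it.

    Assumption: each option is a (gains, sigma) pair defining a
    linear-Gaussian policy with mean gains . state and std sigma.
    The option index plays the role of the latent variable.
    """
    probs = gating_probs(state, gate_weights)
    option = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    gains, sigma = option_params[option]
    mean = sum(g * s for g, s in zip(gains, state))
    action = rng.gauss(mean, sigma)
    return option, action


if __name__ == "__main__":
    rng = random.Random(0)
    state = [1.0, 0.5]
    gate_weights = [[1.0, 0.0], [0.0, 1.0]]  # two options
    option_params = [([2.0, 0.0], 0.1), ([0.0, -2.0], 0.1)]
    print(sample_action(state, gate_weights, option_params, rng))
```

When only states and actions are observed, the sampled option index is hidden, which is why the abstract treats learning such a policy as a latent variable estimation problem.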