  Solving Deep Memory POMDPs with Recurrent Policy Gradients

Wierstra, D., Förster, A., Peters, J., & Schmidhuber, J. (2007). Solving Deep Memory POMDPs with Recurrent Policy Gradients. Artificial Neural Networks: ICANN 2007, 697-706.

 Creators:
Wierstra, D., Author
Förster, A., Author
Peters, J.1, 2, Author
Schmidhuber, J., Author
Affiliations:
1Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497795              
2Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society, ou_1497647              

Content

Free keywords: -
Abstract: This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method that creates limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) requiring long-term memories of past observations. The approach approximates a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a “Long Short-Term Memory” (LSTM) architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.
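For orientation, the following is a minimal sketch of the return-weighted, REINFORCE-style recurrent policy gradient the abstract describes, written here in PyTorch with an LSTM policy. The class and function names (RecurrentPolicy, episode_loss), the toy dimensions, and the stand-in episode data are illustrative assumptions, not the authors' code; a real agent would sample actions step by step while interacting with a POMDP and compute actual returns-to-go.

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """LSTM policy: maps an observation sequence to per-step action logits."""
    def __init__(self, obs_dim, hidden_dim, n_actions):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs_seq, state=None):
        out, state = self.lstm(obs_seq, state)  # hidden state carries memory
        return self.head(out), state

def episode_loss(policy, obs, actions, returns):
    """Return-weighted negative log-likelihood of the taken actions.
    Backpropagating this loss sends the characteristic eligibilities
    grad log pi(a_t | history_t) back through time via the unrolled LSTM."""
    logits, _ = policy(obs.unsqueeze(0))                      # (1, T, A)
    log_probs = torch.log_softmax(logits.squeeze(0), dim=-1)  # (T, A)
    taken = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    return -(returns * taken).sum()

# Toy usage with stand-in data.
T, obs_dim, n_actions = 20, 4, 2
policy = RecurrentPolicy(obs_dim, hidden_dim=16, n_actions=n_actions)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

obs = torch.randn(T, obs_dim)                # observation sequence
actions = torch.randint(0, n_actions, (T,))  # actions sampled in the episode
returns = torch.ones(T)                      # returns-to-go for each step

optimizer.zero_grad()
loss = episode_loss(policy, obs, actions, returns)
loss.backward()   # backpropagation through time over the whole episode
optimizer.step()
```

Calling loss.backward() unrolls the LSTM, so each per-step eligibility is weighted by its return and propagated back through time, which is the core of the recurrent policy gradient estimator.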

Details

Language(s): -
 Dates: 2007-09
 Publication Status: Issued
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Degree: -

Event

Title: International Conference on Artificial Neural Networks
Place of Event: Porto, Portugal
Start-/End Date: -

Source 1

Title: Artificial Neural Networks: ICANN 2007
Source Genre: Proceedings
 Creator(s):
Affiliations:
Publ. Info: Berlin, Germany : Springer
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 697 - 706
Identifier: -