  Fitted Q-iteration by Advantage Weighted Regression

Neumann, G., & Peters, J. (2009). Fitted Q-iteration by Advantage Weighted Regression. Advances in neural information processing systems 21: 22nd Annual Conference on Neural Information Processing Systems 2008, 1177-1184.


Creators

 Creators:
Neumann, G., Author
Peters, J. (1, 2), Author
Koller, D., Editor
Schuurmans, D., Editor
Bengio, Y., Editor
Bottou, L., Editor
Affiliations:
1 Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497795
2 Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society, ou_1497647

Content

Keywords: -
 Abstract: Recently, fitted Q-iteration (FQI) based methods have become more popular due to their increased sample efficiency, a more stable learning process and the higher quality of the resulting policy. However, these methods remain hard to use for continuous action spaces, which frequently occur in real-world tasks, e.g., in robotics and other technical applications. The greedy action selection commonly used for the policy improvement step is particularly problematic, as it is expensive for continuous actions, can cause an unstable learning process, introduces an optimization bias and results in highly non-smooth policies unsuitable for real-world systems. In this paper, we show that by using a soft-greedy action selection, the policy improvement step used in FQI can be simplified to an inexpensive advantage-weighted regression. With this result, we are able to derive a new, computationally efficient FQI algorithm which can even deal with high-dimensional action spaces.
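
Illustrative note (not part of the record): the abstract states that soft-greedy action selection reduces policy improvement to an advantage-weighted regression. The Python sketch below shows one way such a step might look for a scalar state and action. It is a minimal illustration under stated assumptions, not the authors' algorithm: the exponential weighting `exp(advantage / temperature)`, the linear policy mean, and all function and parameter names are hypothetical choices that do not appear in this record.

```python
import math


def advantage_weights(advantages, temperature=1.0):
    """Assumed soft-greedy weights: w_i = exp(A_i / temperature).

    The maximum advantage is subtracted first for numerical stability;
    this rescales all weights by a common factor and leaves the fit unchanged.
    """
    a_max = max(advantages)
    return [math.exp((a - a_max) / temperature) for a in advantages]


def weighted_linear_fit(states, actions, weights):
    """Weighted least squares for a scalar model: action ~ slope * state + intercept."""
    w_sum = sum(weights)
    s_mean = sum(w * s for w, s in zip(weights, states)) / w_sum
    a_mean = sum(w * a for w, a in zip(weights, actions)) / w_sum
    cov = sum(w * (s - s_mean) * (a - a_mean)
              for w, s, a in zip(weights, states, actions))
    var = sum(w * (s - s_mean) ** 2 for w, s in zip(weights, states))
    slope = cov / var if var > 0 else 0.0
    return slope, a_mean - slope * s_mean


def awr_policy_update(states, actions, q_values, v_values, temperature=1.0):
    """One hypothetical policy-improvement step.

    Samples whose action has a high advantage A = Q(s, a) - V(s) dominate
    the regression, so no explicit maximization over actions is needed.
    """
    advantages = [q - v for q, v in zip(q_values, v_values)]
    weights = advantage_weights(advantages, temperature)
    return weighted_linear_fit(states, actions, weights)


# Toy data (made up): the first three samples follow a = -s and have high Q-values.
states = [-1.0, 0.0, 1.0, -1.0, 1.0]
actions = [1.0, 0.0, -1.0, -1.0, 1.0]
q_values = [0.0, 0.0, 0.0, -10.0, -10.0]
v_values = [0.0] * 5

slope, intercept = awr_policy_update(states, actions, q_values, v_values,
                                     temperature=0.5)
print(slope, intercept)
```

With a small temperature, the fit is dominated by the high-advantage samples and approaches the greedy choice; with a large temperature, all samples count almost equally and the policy changes more smoothly.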

Details

Language(s):
 Date: 2009-06
 Publication status: Published
 Pages: -
 Place, publisher, edition: -
 Table of contents: -
 Review type: -
 Identifiers: ISBN: 978-1-605-60949-2
URI: http://nips.cc/Conferences/2008/
BibTex Citekey: 5520
 Degree type: -

Event

Title: Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS 2008)
Venue: Vancouver, BC, Canada
Start/end date: -

Source 1

Title: Advances in Neural Information Processing Systems 21: 22nd Annual Conference on Neural Information Processing Systems 2008
Source genre: Journal
 Creators:
Affiliations:
Place, publisher, edition: Red Hook, NY, USA : Curran
Pages: -
Volume / Issue: -
Article number: -
Start / End page: 1177 - 1184
Identifier: -