Conference Paper

Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

MPS-Authors

Grau-Moya, J
Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society;
Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society;


Leibfried, F
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society;


Genewein, T
Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Research Group Sensorimotor Learning and Decision-making, Max Planck Institute for Intelligent Systems, Max Planck Society;


Braun, DA
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Research Group Sensorimotor Learning and Decision-making, Max Planck Institute for Intelligent Systems, Max Planck Society;

Citation

Grau-Moya, J., Leibfried, F., Genewein, T., & Braun, D. (2016). Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes. In P. Frasconi, N. Landwehr, G. Manco, & J. Vreeken (Eds.), Machine Learning and Knowledge Discovery in Databases (pp. 475-491). Cham, Switzerland: Springer.


Cite as: https://hdl.handle.net/21.11116/0000-0000-7A78-1
Abstract
Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
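
The value iteration idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a finite set of candidate transition models `T` with prior weights `mu`, a reference policy `rho`, and two inverse-temperature parameters, `alpha` for the policy and `beta` for the model belief. All of these names and the nested soft-aggregation form are assumptions made for illustration.

```python
import numpy as np


def soft_aggregate(values, weights, beta, axis):
    """Return (1/beta) * log(sum_i weights_i * exp(beta * values_i)) along `axis`.

    beta == 0 gives the weighted mean; large positive beta approaches the
    maximum and large negative beta approaches the minimum.
    """
    if beta == 0.0:
        return np.sum(weights * values, axis=axis)
    z = beta * values
    z_max = np.max(z, axis=axis, keepdims=True)  # shift for numerical stability
    log_sum = np.log(np.sum(weights * np.exp(z - z_max), axis=axis))
    return (log_sum + np.squeeze(z_max, axis=axis)) / beta


def generalized_value_iteration(T, mu, R, rho, alpha, beta,
                                gamma=0.9, tol=1e-8, max_iter=10_000):
    """Illustrative nested soft value iteration (hypothetical interface).

    T:   (K, S, A, S) candidate transition models, T[k, s, a, s']
    mu:  (K,) prior weights over the candidate models
    R:   (S, A) rewards
    rho: (S, A) reference policy, each row sums to 1
    """
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        # Expected return under each candidate model, shape (K, S, A).
        Q_model = R[None] + gamma * (T @ V)
        # Aggregate over models: beta < 0 is pessimistic, beta = 0 averages.
        Q = soft_aggregate(Q_model, mu[:, None, None], beta, axis=0)
        # Aggregate over actions relative to the reference policy.
        V_new = soft_aggregate(Q, rho, alpha, axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V
```

Under these assumptions, a single model with a large `alpha` approximates standard value iteration, `beta = 0` averages over the candidate models in the manner of Bayesian planning, and a strongly negative `beta` approaches worst-case (robust) planning. Each aggregation step is monotone and non-expansive in the supremum norm, so with `gamma < 1` the sketched update is a contraction.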