Hilfe Wegweiser Impressum Kontakt Einloggen





Predictive Representations For Sequential Decision Making Under Uncertainty


Boularias,  A
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Externe Ressourcen
Es sind keine Externen Ressourcen verfügbar
Volltexte (frei zugänglich)
Es sind keine frei zugänglichen Volltexte verfügbar
Ergänzendes Material (frei zugänglich)
Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar

Boularias, A. (2010). Predictive Representations For Sequential Decision Making Under Uncertainty.

The problem of making decisions is ubiquitous in life. This problem becomes even more complex when the decisions should be made sequentially. In fact, the execution of an action at a given time leads to a change in the environment of the problem, and this change cannot be predicted with certainty. The aim of a decision-making process is to optimally select actions in an uncertain environment. To this end, the environment is often modeled as a dynamical system with multiple states, and the actions are executed so that the system evolves toward a desirable state. In this thesis, we proposed a family of stochastic models and algorithms in order to improve the quality of of the decision-making process. The proposed models are alternative to Markov Decision Processes, a largely used framework for this type of problems. In particular, we showed that the state of a dynamical system can be represented more compactly if it is described in terms of predictions of certain future events. We also showed that even the cognitive process of selecting actions, known as policy, can be seen as a dynamical system. Starting from this observation, we proposed a panoply of algorithms, all based on predictive policy representations, in order to solve different problems of decision-making, such as decentralized planning, reinforcement learning, or imitation learning. We also analytically and empirically demonstrated that the proposed approaches lead to a decrease in the computational complexity and an increase in the quality of the decisions, compared to standard approaches for planning and learning under uncertainty.