Peters, J.: Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society
https://www.sciencedirect.com/science/article/pii/S0893608009000045 (publisher's version)
Hachiya, H., Akiyama, T., Sugiyama, M., & Peters, J. (2009). Adaptive Importance Sampling for Value Function Approximation in Off-policy Reinforcement Learning. Neural Networks, 22(10), 1399-1410. doi:10.1016/j.neunet.2009.01.002.