Peters, J. Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society
https://academic.oup.com/jigpal/article-pdf/18/5/620/2255078/jzp049.pdf (Publisher version)
Wierstra, D., Förster, A., Peters, J., & Schmidhuber, J. (2010). Recurrent Policy Gradients. Logic Journal of the IGPL, 18(5), 620-634. doi:10.1093/jigpal/jzp049.