Riedmiller, M., Peters, J., & Schaal, S. (2007). Evaluation of Policy Gradient
Methods and Variants on the Cart-Pole Benchmark. In 2007 IEEE International Symposium on Approximate
Dynamic Programming and Reinforcement Learning (pp. 254-261). Los Alamitos, CA, USA: IEEE Computer Society.