References

[1]

Bellman Richard Ernest, Stability theory of differential equations /. New York : McGraw-Hill, 1953.

[2]

S. Zhao, “Mathematical foundations of reinforcement learning.” 2023. https://github.com/MathFoundationRL/Book-Mathmatical-Foundation-of-Reinforcement-Learning (accessed Mar. 30, 2023).

[3]

R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction, Second. The MIT Press, 2018. Available: http://incompleteideas.net/book/the-book-2nd.html

[4]

R. J. Williams, “Simple statistical gradient-following algorithms for connectionist reinforcement learning,” Machine Learning, vol. 8, no. 3, pp. 229–256, May 1992, doi: 10.1007/BF00992696.

[5]

T. P. Lillicrap et al., “Continuous control with deep reinforcement learning.” 2019. Available: https://arxiv.org/abs/1509.02971

[6]

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” CoRR, vol. abs/1707.06347, 2017, Available: http://arxiv.org/abs/1707.06347

[7]

A. Iserles, A first course in the numerical analysis of differential equations /, 2. ed. in Cambridge texts in applied mathematics. Cambridge ; Cambridge University Press, 2009.

[8]

P. Birken, “Numerical methods for stiff problems.” Lecture Notes, 2022.

[9]

M. Andrychowicz et al., “Learning to learn by gradient descent by gradient descent,” CoRR, vol. abs/1606.04474, 2016, Available: http://arxiv.org/abs/1606.04474

[10]

A. Fawzi et al., “Discovering faster matrix multiplication algorithms with reinforcement learning,” Nature, vol. 610, no. 7930, pp. 47–53, Oct. 2022, doi: 10.1038/s41586-022-05172-4.

[11]

C. Mahoney, “Reinforcement learning: A review of the historic, modern, and future applications of this special form of machine learning.” https://towardsdatascience.com/reinforcement-learning-fda8ff535bb6, 2021.

[12]

D. Silver et al., “Mastering the game of go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, Jan. 2016, doi: 10.1038/nature16961.

[13]

A. Atangana, Fractional operators with constant and variable order with application to geo-hydrology. Academic Press, 2018. Available: https://ludwig.lub.lu.se/login?url=https://www.sciencedirect.com/science/book/9780128096703

[14]

R. Bellman, R. E. Bellman, and R. Corporation, Dynamic programming. in Rand corporation research study. Princeton University Press, 1957. Available: https://books.google.se/books?id=rZW4ugAACAAJ

[15]

W. A. Adkins, M. G. Davidson, and S. (Online service), Ordinary differential equations. in Undergraduate texts in mathematics,. New York, NY : Springer New York :, 2012. Available: http://dx.doi.org/10.1007/978-1-4614-3618-8

[16]

Wolfgang. Hackbusch and S. (Online service), Iterative solution of large sparse systems of equations /, 2nd ed. 2016. in Applied mathematical sciences,. Cham : Springer International Publishing :, 2016. Available: http://dx.doi.org/10.1007/978-3-319-28483-5

[17]

E. Ludvig, M. Bellemare, and K. Pearson, “A primer on reinforcement learning in the brain: Psychological, computational, and neural perspectives,” in Computational Neuroscience for Advancing Artificial Intelligence: Models, Methods and Applications, 2011, pp. 111–144. doi: 10.4018/978-1-60960-021-1.ch006.

[18]

J. Tromp, “Counting legal positions in go — tromp.github.io.” https://tromp.github.io/go/legal.html.

[19]

S. M. Ross and E. A. Peköz, A second course in probability. ProbabilityBookstore.com, 2007. Available: https://books.google.se/books?id=g5j6DwAAQBAJ

[20]

L. N. Trefethen and D. Bau, Numerical linear algebra /. Philadelphia : SIAM, Society for Industrial; Applied Mathematics, cop. 1997.

[21]

S. Cuomo, V. S. di Cola, F. Giampaolo, G. Rozza, M. Raissi, and F. Piccialli, “Scientific machine learning through physics-informed neural networks: Where we are and what’s next.” 2022. Available: https://arxiv.org/abs/2201.05624

[22]

R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models.” 2022. Available: https://arxiv.org/abs/2112.10752

[23]

J. Kober, J. Bagnell, and J. Peters, “Reinforcement learning in robotics: A survey,” The International Journal of Robotics Research, vol. 32, pp. 1238–1274, Sep. 2013, doi: 10.1177/0278364913495721.

[24]

B. Hambly, R. Xu, and H. Yang, “Recent advances in reinforcement learning in finance,” Mathematical Finance, vol. n/a, no. n/a, doi: https://doi.org/10.1111/mafi.12382.

[25]

X. Chen, L. Yao, J. McAuley, G. Zhou, and X. Wang, “A survey of deep reinforcement learning in recommender systems: A systematic review and future directions.” 2021. Available: https://arxiv.org/abs/2109.03540