Short description

In the last couple of years, a lot of progress has been made in the Reinforcement Learning domain, and increasingly good results can be achieved. The results of different algorithms can vary greatly across different problems. Additionally, hyperparameters such as the learning rate and the choice of optimizer can play a big role in the results that are achieved. Consequently, fine-tuning has to be done to obtain the best result. A recently very popular algorithm is Proximal Policy Optimization (PPO) (1). In this paper, PPO is tested on different environments with varying optimizers and varying learning rates, in order to examine and ultimately better understand the differences in the results.
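At the core of PPO (1) is the clipped surrogate objective, which limits how far each policy update can move away from the old policy. A minimal sketch of that objective in NumPy follows; the function name and the default clip range eps=0.2 are illustrative, not taken from this work:

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate objective of PPO.

    ratio:     probability ratio pi_new(a|s) / pi_old(a|s)
    advantage: estimated advantage of the taken action
    eps:       clip range (0.2 is a common default, assumed here)
    """
    unclipped = ratio * advantage
    # Clipping the ratio removes the incentive to move the policy
    # further than eps away from the old policy in a single update.
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Taking the minimum yields a pessimistic (lower) bound.
    return np.minimum(unclipped, clipped)
```

For example, with a positive advantage of 1.0 and a ratio of 1.5, the objective is capped at 1.2, so the update gains nothing from pushing the ratio beyond 1 + eps.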

Documentation

