Peter Regier, Lukas Gesing and Maren Bennewitz, University of Bonn, Germany
Collision-free motion is essential for mobile robots. Most approaches to collision-free and efficient navigation with wheeled robots require parameter tuning by experts to obtain good navigation behavior. In this paper, we aim to learn an optimal navigation policy with deep reinforcement learning, thereby eliminating the need for manual parameter tuning. Our approach uses proximal policy optimization (PPO) to train a policy that achieves collision-free, goal-directed behavior. The outputs of the learned network are the robot's translational and angular velocities for the next time step. Our method combines path planning on a 2D grid with reinforcement learning and does not require any supervision. The network is first trained in a simple environment and then transferred to scenarios of increasing complexity. We implemented our approach in C++ and Python for the Robot Operating System (ROS) and thoroughly tested it in several simulated as well as real-world experiments. The experiments demonstrate that our trained policy can be applied to solve complex navigation tasks. Furthermore, we compare the performance of our learned controller to the popular dynamic window approach (DWA) of ROS. As the experimental results show, a robot controlled by our learned policy reaches the goal significantly faster than one using the DWA, since it passes obstacles more closely and thus saves time.
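To make the abstract's input/output interface concrete, the following is a minimal illustrative sketch (not the authors' implementation) of an actor network of the kind PPO would train here: it maps an observation vector to a translational and an angular velocity command. The observation layout (laser-range bins plus relative goal coordinates), the network size, and the velocity limits are all assumptions for illustration.

```python
import numpy as np

class VelocityPolicy:
    """Toy feed-forward actor mapping observations to (v, omega) commands.

    Hypothetical stand-in for a PPO-trained policy network; weights are
    random here, whereas the real network's weights result from training.
    """

    def __init__(self, obs_dim, hidden=64, v_max=0.5, w_max=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (obs_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, 2))
        self.b2 = np.zeros(2)
        self.v_max = v_max  # assumed max translational velocity [m/s]
        self.w_max = w_max  # assumed max angular velocity [rad/s]

    def act(self, obs):
        """Return velocity commands for the next time step."""
        h = np.tanh(obs @ self.w1 + self.b1)
        out = np.tanh(h @ self.w2 + self.b2)     # squash both outputs to [-1, 1]
        v = self.v_max * (out[0] + 1.0) / 2.0    # forward-only translation in [0, v_max]
        w = self.w_max * out[1]                  # angular velocity in [-w_max, w_max]
        return v, w

# Example: 36 laser-range bins + 2 relative goal coordinates (assumed layout).
policy = VelocityPolicy(obs_dim=38)
v, w = policy.act(np.zeros(38))
```

Squashing the outputs with tanh and rescaling keeps the commands within the platform's velocity limits by construction, which is one common way to bound the action space of such a controller.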