Multi-robot target reaching using modified Q-learning and PSO

Orawan Watchanupaporn; Peerapun Pudtuan

doi:10.1109/ICCAR.2016.7486700

Source

2016 2nd International Conference on Control, Automation and Robotics (ICCAR), pp. 66-69

Abstract

In this paper, a group of mobile robots learns to solve a target reaching problem in a simulated grid environment filled with obstacles. Each robot knows its distance to the target, and the robots can communicate with one another. The proposed learning algorithm combines a reinforcement learning algorithm with a swarm optimization algorithm. Q-learning, a reinforcement learning algorithm, is modified to learn a policy by specifying rewards and punishments for certain robot actions. Particle swarm optimization (PSO), a swarm optimization algorithm, is modified for a grid environment and used to accelerate the learning process for multiple robots. The proposed algorithm outperforms the original Q-learning in both training and testing. It learns 2.23 times faster and requires 6.52 fewer steps to reach the destination. Moreover, it uses less memory than the original Q-learning. We also experiment with various numbers of robots. The results show that larger groups of robots learn faster in most cases.
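
The abstract does not give the details of the modified Q-learning or the grid-adapted PSO, so the following is only a minimal sketch of the baseline it compares against: standard tabular Q-learning for one robot on a grid with obstacles. The reward values, the use of Manhattan distance to the target, and all names (`step`, `reward`, `q_update`, `alpha`, `gamma`) are assumptions made for illustration, not the authors' design. The PSO component is not shown.

```python
from collections import defaultdict

# Four grid moves as (dx, dy) offsets; an assumed action set.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def manhattan(a, b):
    """Grid distance between two cells; stands in for 'distance to the target'."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def step(pos, action, obstacles, size):
    """Apply a move; the robot stays in place if it hits an obstacle or a wall."""
    dx, dy = ACTIONS[action]
    nxt = (pos[0] + dx, pos[1] + dy)
    in_bounds = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    blocked = (nxt in obstacles) or not in_bounds
    return (pos if blocked else nxt), blocked


def reward(pos, nxt, blocked, target):
    """Hypothetical reward/punishment scheme (values are illustrative only)."""
    if blocked:
        return -1.0   # punish collisions
    if nxt == target:
        return 10.0   # reward reaching the target
    # Small reward for moving closer to the target, small punishment otherwise.
    return 0.1 if manhattan(nxt, target) < manhattan(pos, target) else -0.1


def q_update(q, state, action, r, nxt, alpha=0.5, gamma=0.9):
    """Standard Q-learning update: Q <- Q + alpha * (r + gamma * max Q' - Q)."""
    best_next = max(q[(nxt, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
```

A Q-table for this sketch can be created with `q = defaultdict(float)`, so that unseen state-action pairs start at zero. In a multi-robot setting each robot would hold such a table, or the robots would share one; how the paper combines this with PSO is not stated in the abstract.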