Author: Liang, Zhixuan
Title: Distributed and hierarchical deep reinforcement learning for multi-robot autonomous cooperation
Advisors: Cao, Jiannong (COMP)
Degree: Ph.D.
Year: 2024
Subject: Robots
Intelligent control systems
Reinforcement learning
Machine learning
Hong Kong Polytechnic University -- Dissertations
Department: Department of Computing
Pages: xviii, 131 pages : color illustrations
Language: English
Abstract: At present, multi-robot systems (MRS) have attracted extensive attention for their application in such settings as package delivery, space exploration and autonomous driving. A fundamental problem in MRS is how multiple robots can cooperate to accomplish a common goal or task. The recent development of deep reinforcement learning (DRL) provides a solution to enable robots to learn to cooperate in dynamic and complex environments. However, existing DRL approaches tend to rely on centralized training and flat neural architecture design, leading to potential issues such as a single point of failure, performance bottlenecks, and low learning efficiency. Therefore, the goal of this thesis is to develop and implement effective and efficient DRL approaches for real-world multi-robot cooperation.
Several significant challenges must be surmounted in achieving this goal. First, the primary foundation of DRL is the trial-and-error process: each robot senses the state of the environment, takes an action, and receives a corresponding reward to update its policy network. However, such a trial-and-error process can be prohibitively costly in real-world multi-robot settings. Second, the policy search space of each robot can expand significantly as it needs to consider the states and actions of other robots. This expansion leads to a high learning complexity, making it challenging to find optimal cooperative strategies. Third, designing a suitable reward signal for each robot is non-trivial because there is no prior knowledge with which to quantify each individual's impact on the team's cooperation. The conventional approach of assigning a shared global reward may lack fairness and impair learning efficiency.
To tackle these challenges, this thesis starts by designing and developing a training and evaluation platform that incorporates diverse cooperative scenarios, a social agent modeling algorithm, an RL-friendly API design, and a generalization evaluation metric. This platform serves as a benchmark environment for safe training and testing of different DRL approaches. Then, we propose a novel distributed and hierarchical learning approach that includes high-level cooperative decision-making and low-level individual control. The cooperation of multiple robots can be efficiently learned in a high-level discrete action space, while the low-level individual control can be reduced to single-agent reinforcement learning. Our approach reduces the learning complexity by decomposing the overall task into sub-tasks in a hierarchical way. In addition, we propose a communication-efficient hierarchical reinforcement learning approach to facilitate multi-robot communication in a partially observable environment. Furthermore, we propose a novel reward function design, named SVR (Shapley-value-based Reward), inspired by economics and cooperative game theory. This method offers a model-free approach to quantify each individual's contribution, adopting a widely accepted fairness computation in economics. All the above approaches are trained and evaluated on the platform we developed. The experimental results demonstrate the effectiveness and efficiency of our approaches in terms of low collision rate, high lane-change success rate, and fast convergence speed.
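As background for the Shapley-value idea mentioned above, the following is a minimal sketch of the standard Shapley value from cooperative game theory, computed exactly by averaging each player's marginal contribution over all orderings. It is not the thesis's SVR method: the function names, the three robots "a", "b", "c", and the toy team_reward function are all hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value` maps a frozenset of players to the reward that coalition earns.
    Cost grows factorially with the number of players, so this is only
    practical for very small teams.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def team_reward(coalition):
    """Hypothetical toy reward: "a" and "b" earn 10 only together; "c" adds 2."""
    reward = 0.0
    if {"a", "b"} <= coalition:
        reward += 10.0
    if "c" in coalition:
        reward += 2.0
    return reward


values = shapley_values(["a", "b", "c"], team_reward)
print(values)
```

In this toy game the individual shares sum to the reward of the full team, which is the efficiency property that makes the Shapley value a common choice for fair credit assignment.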
In conclusion, we summarize the lessons learned from training and from developing real-world applications, discuss open questions, and explore several future research directions.
Rights: All rights reserved
Access: open access

Files in This Item:
File: 7262.pdf
Description: For All Users
Size: 10.18 MB
Format: Adobe PDF


Copyright Undertaking

As a bona fide Library user, I declare that:

  1. I will abide by the rules and legal ordinances governing copyright regarding the use of the Database.
  2. I will use the Database for the purpose of my research or private study only and not for circulation or further reproduction or any other purpose.
  3. I agree to indemnify and hold the University harmless from and against any loss, damage, cost, liability or expenses arising from copyright infringement or unauthorized usage.

By downloading any item(s) listed above, you acknowledge that you have read and understood the copyright undertaking as stated above, and agree to be bound by all of its terms.

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12812