dc.titleDistributed and hierarchical deep reinforcement learning for multi-robot autonomous cooperationen_US
dcterms.abstractMulti-robot systems (MRS) have attracted extensive attention for applications such as package delivery, space exploration, and autonomous driving. A fundamental problem in MRS is how multiple robots can cooperate to achieve a common goal or complete a shared task. Recent developments in deep reinforcement learning (DRL) offer a way for robots to learn to cooperate in dynamic and complex environments. However, existing DRL approaches tend to rely on centralized training and flat neural architectures, which can lead to a single point of failure, performance bottlenecks, and low learning efficiency. Therefore, the goal of this thesis is to develop and implement effective and efficient DRL approaches for real-world multi-robot cooperation.en_US
dcterms.abstractSeveral significant challenges must be overcome to achieve this goal. First, DRL is founded on a trial-and-error process: each robot senses the state of the environment, takes an action, and receives a corresponding reward that it uses to update its policy network. Such a process can be prohibitively costly in real-world multi-robot settings. Second, the policy search space of each robot expands significantly because it must consider the states and actions of the other robots. This expansion leads to high learning complexity, making it challenging to find optimal cooperative strategies. Third, designing a suitable reward signal for each robot is non-trivial because there is no prior knowledge for quantifying each robot’s individual impact on the team’s cooperation. The conventional approach of assigning a shared global reward may lack fairness and impair learning efficiency.en_US
dcterms.abstractTo tackle these challenges, this thesis starts by designing and developing a training and evaluation platform that incorporates diverse cooperative scenarios, a social agent modeling algorithm, an RL-friendly API design, and a generalization evaluation metric. This platform serves as a benchmark environment for safe training and testing of different DRL approaches. Then, we propose a novel distributed and hierarchical learning approach that combines high-level cooperative decision-making with low-level individual control. Cooperation among multiple robots can be learned efficiently in a high-level discrete action space, while the low-level individual control reduces to single-agent reinforcement learning. Our approach reduces learning complexity by decomposing the overall task hierarchically into sub-tasks. In addition, we propose a communication-efficient hierarchical reinforcement learning approach to facilitate multi-robot communication in a partially observable environment. Furthermore, we propose a novel reward function design, named SVR (Shapley-value-based Reward), inspired by economics and cooperative game theory. This method offers a model-free way to quantify each individual’s contribution, adopting a fairness computation that is widely accepted in economics. All of the above approaches are trained and evaluated on the platform we developed. The experimental results demonstrate the effectiveness and efficiency of our approaches in terms of a low collision rate, a high lane-change success rate, and fast convergence.en_US
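The idea behind a Shapley-value-based reward can be illustrated with a minimal sketch. This is not the thesis's SVR implementation: it computes exact Shapley values by enumerating every ordering of the robots, whereas the abstract describes a model-free method whose details are not given here. The function names `shapley_values` and `team_reward`, the robot names, and the coalition payoffs in the table are all hypothetical and chosen only for illustration.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    players: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley values: average each player's marginal contribution
    over all orderings in which the team could be assembled."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def team_reward(coalition: FrozenSet[str]) -> float:
    """Hypothetical team payoffs for two robots (illustrative numbers only)."""
    table = {
        frozenset(): 0.0,
        frozenset({"r1"}): 1.0,
        frozenset({"r2"}): 2.0,
        frozenset({"r1", "r2"}): 6.0,
    }
    return table[coalition]


rewards = shapley_values(["r1", "r2"], team_reward)
print(rewards)
```

With these assumed payoffs, the individual rewards sum to the payoff of the full team, which is the efficiency property that makes the Shapley value a common fairness criterion. Exact enumeration grows factorially with the number of robots, so larger teams would need sampling or another approximation.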
dcterms.abstractIn conclusion, we summarize the lessons learned from training and developing real-world applications, discuss open questions, and explore several directions for future research.en_US
