Author: Long, Meng
Title: Deep reinforcement learning for transit signal priority in a connected environment
Advisors: Chung, Edward (EEE)
Gu, Weihua (EEE)
Degree: Ph.D.
Year: 2024
Department: Department of Electrical and Electronic Engineering
Pages: 79 pages : color illustrations
Language: English
Abstract: Growing urbanization and population upset the balance between travel demand and road capacity in metropolises, causing heavy traffic congestion. Public transit can carry more people than private vehicles while occupying less road space. Improving transit service can effectively alleviate congestion, because better service attracts travelers from private cars to buses. Moreover, traffic lights at intersections introduce signal delays, which can become the primary source of transit delay and harm bus efficiency and reliability. Hence, transit signal priority (TSP) is an essential measure for reducing traffic congestion and promoting transit reliability at signalized intersections. Emerging connected vehicle technology and reinforcement learning (RL) algorithms open the way to more intelligent TSP strategies, because they offer more detailed and accurate information and more robust algorithms.
The first work proposes an extended Dueling Double Deep Q-learning with Invalid action masking (eD3QNI) algorithm for TSP at isolated intersections in a connected environment. The algorithm handles multiple conflicting bus priority requests, subject to constraints on the traffic light and the phase skipping rule, and aims to reduce the person delay of buses. Simulation results demonstrate that eD3QNI produces lower average person delay and schedule delay than other methods. They also show that the invalid action masking (IAM) method is superior to the usual variable decision points (VDP) method in convergence speed, performance improvement, and the application of domain knowledge to the RL algorithm.
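The general idea behind invalid action masking in a dueling double Q-learning setting can be illustrated with a minimal sketch. This is not the thesis's eD3QNI implementation: the function names, the array shapes, and the choice to apply the mask inside the double Q-learning target are all assumptions made for illustration, and plain NumPy arrays stand in for neural network outputs.

```python
import numpy as np

def dueling_q(value, advantages):
    """Combine a state value and per-action advantages (dueling architecture)."""
    adv = np.asarray(advantages, dtype=float)
    return value + adv - adv.mean()

def masked_greedy_action(q_values, valid_mask):
    """Return the index of the best action among those marked valid."""
    q = np.asarray(q_values, dtype=float)
    mask = np.asarray(valid_mask, dtype=bool)
    if not mask.any():
        raise ValueError("at least one action must be valid")
    # Invalid actions get -inf so argmax can never select them.
    return int(np.argmax(np.where(mask, q, -np.inf)))

def masked_double_q_target(reward, gamma, next_q_online, next_q_target,
                           next_valid_mask, done=False):
    """Double Q-learning target with the mask applied to action selection.

    The online network picks the action (restricted to valid ones);
    the target network evaluates it.
    """
    if done:
        return float(reward)
    best = masked_greedy_action(next_q_online, next_valid_mask)
    return float(reward + gamma * np.asarray(next_q_target, dtype=float)[best])

# Hypothetical example: three candidate phases, the second one disallowed
# (for instance by a phase skipping rule).
action = masked_greedy_action([1.0, 5.0, 3.0], [True, False, True])
```

In this sketch the mask encodes domain knowledge as a hard constraint: an action ruled out by the signal rules is never selected, however large its estimated Q-value.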
To extend the above work to a multi-intersection environment, the second work develops a Cooperative TSP strategy of Variable phase (CTSPV) using multi-agent reinforcement learning (MARL) to improve transit schedule adherence on arterial roads. Agents determine the phase of the next step, so that the phase sequence and duration vary with real-time traffic conditions, taking into account the trade-off between transit and non-transit vehicles, multiple conflicting bus requests, and cooperation between agents. Three kinds of traffic constraints are tested, and the results verify that proper restrictions on RL are necessary to guarantee experience quality and training efficiency. This work also analyzes how the signal timing pattern of CTSPV differs from that of a fixed-time signal and shows that the knowledge learned by RL performs well.
As headway regularity is another essential indicator of transit reliability besides schedule adherence, the third work develops a Cooperative TSP strategy of Variable phase for Headway adherence (CTSPVH) to improve transit headway adherence on arterial roads. The proposed approach considers four critical aspects: complicated states with multiple conflicting bus requests, rational actions constrained by domain knowledge, comprehensive rewards balancing buses and cars, and a collaborative training scheme among agents. These aspects are addressed by a proper state representation, IAM algorithms that mask out irrational actions, and reward functions formulated from general traffic queue and transit headway deviation.
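A reward that balances general traffic queue against transit headway deviation could take roughly the following form. This is only a sketch: the abstract does not give the actual reward formula, so the weighted-sum form, the weights `w_queue` and `w_headway`, the use of absolute deviation, and the units (vehicles and seconds) are all hypothetical.

```python
def headway_deviation(actual_headways, scheduled_headway):
    """Sum of absolute deviations of observed headways from the schedule."""
    return sum(abs(h - scheduled_headway) for h in actual_headways)

def tsp_reward(queue_lengths, actual_headways, scheduled_headway,
               w_queue=1.0, w_headway=1.0):
    """Negative weighted cost: longer queues and irregular headways both hurt.

    The weights are hypothetical tuning parameters, not values from the thesis.
    """
    queue_penalty = sum(queue_lengths)
    headway_penalty = headway_deviation(actual_headways, scheduled_headway)
    return -(w_queue * queue_penalty + w_headway * headway_penalty)

# Hypothetical example: two approaches with 3 and 5 queued vehicles, and two
# observed bus headways of 280 s and 330 s against a 300 s schedule.
r = tsp_reward([3, 5], [280, 330], 300, w_queue=1.0, w_headway=0.1)
```

The relative size of the two weights sets how much priority buses receive over general traffic, which is the trade-off the abstract describes as balancing buses and cars.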
These three strategies address the key problems of TSP approaches and incorporate traffic domain knowledge into RL to ensure action rationality, avoiding the severe congestion and serious accidents that irrational actions could cause. They are therefore intelligent and safe strategies with bright prospects for practical application in improving transit efficiency and reliability.
Rights: All rights reserved
Access: open access

Files in This Item:
File | Description | Size | Format
7535.pdf | For All Users | 2.47 MB | Adobe PDF


Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/13086