Adaptive Traffic Signal Control Using Multi-Agent Reinforcement Learning: A Comparison of Control Strategies

Research Abstract

Urban traffic congestion remains a persistent challenge for conventional fixed-time signal control, particularly under fluctuating and asymmetric demand. Although multi-agent reinforcement learning (MARL) has shown promise for adaptive traffic signal control, previous studies have often focused on isolated intersections, simplified synthetic networks, or deep-learning-based controllers, without systematically comparing tabular and deep value-based multi-agent approaches under equivalent operating conditions. This study addresses that gap by comparing three traffic signal control strategies: fixed-time control, multi-agent tabular Q-learning, and multi-agent Deep Q-Network (MADQN) control. The evaluation was conducted in a microscopic traffic simulation environment using two complementary testbeds: a synthetic two-intersection corridor, which enables controlled analysis of multi-agent coordination, and a real-world digital twin of the 25 January Corridor in Assiut, Egypt, which tests controller robustness under asymmetric geometry and realistic turning movements. The controllers were assessed under low-, medium-, and high-demand scenarios using queue length, cumulative delay, and time-to-collision as operational and safety-related indicators. The results show that MARL-based controllers generally outperform fixed-time control, but their relative performance depends on demand intensity and network complexity. MADQN provides stronger generalization in low-demand and queue-dissipation conditions, whereas tabular Q-learning remains highly competitive and can achieve superior delay reduction in several medium- and high-demand cases. These findings indicate that deeper MARL architectures are not universally superior; rather, adaptive signal control deployment should match the controller architecture to the operational objective, traffic demand regime, and practical complexity of the target corridor.

Research Authors
Mahmoud Owais, Badr O. Mohammed, Abdulrahman A. Kamal, Abdulrahman Shaban, Ahmed H. Mostafa, Kareem Hatem, John Emad, Salah T. Younis, Samia A. Ali, Alaa E. Abdel-Hakim, Islam M. Alkabbany
Research Date
Research Department
Research Journal
Sustainability
Research Pages
5702
Research Publisher
MDPI
Research Rank
Q2
Research Vol
18 (11)
Research Website
https://doi.org/10.3390/su18115702
Research Year
2026