تجاوز إلى المحتوى الرئيسي

Adaptive Traffic Signal Control Using Multi-Agent Reinforcement Learning: A Comparison of Control Strategies

ملخص البحث

Urban traffic congestion remains a persistent challenge for conventional fixed-time signal control, particularly under fluctuating and asymmetric demand. Although multi-agent reinforcement learning (MARL) has shown promise for adaptive traffic signal control, previous studies have often focused on isolated intersections, simplified synthetic networks, or deep-learning-based controllers without systematically comparing tabular and deep-value-based multi-agent approaches under equivalent operating conditions. This study addresses this gap by comparing three traffic signal control strategies: fixed-time control, Multi-Agent Tabular Q-Learning, and multi-agent Deep Q-Network control (MADQN). The evaluation was conducted in a microscopic traffic simulation environment using two complementary testbeds: a synthetic two-intersection corridor, which enables controlled analysis of multi-agent coordination, and a real-world digital twin of the 25 January Corridor in Assiut, Egypt, which tests controller robustness under asymmetric geometry and realistic turning movements. The controllers are assessed under low-, medium-, and high-demand scenarios using queue length, cumulative delay, and Time-To-Collision as operational and safety-related indicators. The results show that MARL-based controllers generally outperform fixed-time control, but their relative performance depends on demand intensity and network complexity. MADQN provides stronger generalization in low-demand and queue-dissipation conditions, whereas Tabular Q-Learning remains highly competitive and can achieve superior delay reduction in several medium- and high-demand cases. These findings indicate that deeper MARL architectures are not universally superior; rather, adaptive signal control deployment should match the controller architecture to the operational objective, traffic demand regime, and practical complexity of the target corridor.

مؤلف البحث
Mahmoud Owais,Badr O. Mohammed, Abdulrahman A. Kamal Abdulrahman A. Kamal, Abdulrahman Shaban, Ahmed H. Mostafa,Kareem Hatem,John Emad,Salah T. Younis ,Samia A. Ali, Alaa E. Abdel-Hakim, Islam M. Alkabbany
تاريخ البحث
مجلة البحث
Sustainability
صفحات البحث
5702
الناشر
MDPI
تصنيف البحث
Q2
عدد البحث
18 (11)
موقع البحث
https://doi.org/10.3390/su18115702
سنة البحث
2026