Performance of Deep Reinforcement Learning for Adaptive Left-Turn Phase Traffic Light Control

Article type: Research

Authors

School of Civil Engineering, College of Engineering, University of Tehran, Tehran, Iran.

DOI: 10.24200/j30.2024.64476.3329

Abstract

This paper compares the performance of two reinforcement learning methods, the Double Dueling Deep Q-Network (3DQN) and the standard Deep Q-Network (DQN), in adaptively controlling the left-turn phase of traffic lights at an urban intersection. Using reinforcement-learning optimization, these value-based methods determine the green duration of each phase and select either the protected or the permitted left-turn phase for the next cycle. Simulations were run for uniform and variable vehicle-flow distributions under both light and heavy traffic. The results show that the double dueling deep Q-network learned more effectively than the standard deep Q-network. Moreover, the 3DQN controller reduced the cumulative vehicle queue length by at least 26% in all simulated cases and improved traffic flow; the reduction was largest under heavy, uniform traffic, reaching 67%. This work can play an important role in developing intelligent traffic control systems.
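The cycle-level decision described above, picking a green duration and a protected vs. permitted left-turn phase from learned Q-values, can be sketched as an epsilon-greedy choice over a joint discrete action space. The duration values and the `select_action` helper below are illustrative assumptions, not the paper's actual configuration:

```python
import random

# Hypothetical discrete action space: each action pairs a candidate green
# duration (seconds; illustrative values) with a left-turn phase type.
GREEN_DURATIONS = [10, 15, 20, 25]          # assumed discretization
PHASE_TYPES = ["protected", "permitted"]    # the two phase options from the abstract
ACTIONS = [(g, p) for g in GREEN_DURATIONS for p in PHASE_TYPES]

def select_action(q_values, epsilon=0.1):
    """Epsilon-greedy choice over the joint duration/phase actions."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))                    # explore
    return max(range(len(ACTIONS)), key=lambda i: q_values[i])   # exploit
```

At the end of each cycle, the chosen index maps back to a (duration, phase-type) pair that is applied for the next cycle.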


Article Title [English]

Performance of Deep Reinforcement Learning for Adaptive Left-Turn Phase Traffic Light Control

Authors [English]

  • Elham Golpayegani
  • Abbas Babazadeh
  • Omid Nayeri
School of Civil Engineering, College of Engineering, University of Tehran, Tehran, Iran

Abstract [English]

As traffic conditions become more complex and demanding, traditional methods of traffic signal control often fall short. Applying artificial intelligence and machine learning algorithms to traffic light timing has proven highly promising. This research uses reinforcement learning to manage traffic light phases automatically and efficiently, enhancing traffic flow and reducing intersection queue lengths. The paper examines the effectiveness of deep reinforcement learning techniques in optimizing the adaptive control of left-turn phases at urban intersections, comparing the performance of the Double Dueling Deep Q-Network (3DQN) with the standard Deep Q-Network (DQN). In the proposed approach, these value-based methods use reinforcement-learning optimization to determine the green duration for each phase and to select either the protected or the permitted left-turn phase for the next cycle. The adaptive control system adjusts traffic light timings in real time without human intervention, ensuring smoother and more efficient traffic flow and significantly reducing queue lengths. The 3DQN algorithm uses a target network that updates target Q-values at a slower rate to stabilize training and minimize errors. The dueling network splits the neural network into two streams: one estimates the expected reward of a state, and the other assesses the relative importance (advantage) of each action. Simulations were conducted with both uniform and variable car-flow distributions, under light and heavy traffic volumes. They show that controllers using the 3DQN algorithm outperform those using DQN. The results also reveal that 3DQN can reduce cumulative vehicle queue lengths by at least 26% in all cases, and by up to 67% in scenarios with heavy, uniform traffic flow. This research contributes to the development of intelligent traffic control systems and the reduction of traffic delays.
The study highlights the potential of adaptive control systems using reinforcement learning to optimize traffic light timings and mitigate vehicle queue lengths, supporting the advancement of intelligent traffic control systems capable of adapting to dynamic urban conditions.
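The two mechanisms named above, the slowly-updated target network of double Q-learning and the dueling value/advantage split, can be sketched numerically. `dueling_q` and `double_q_target` are illustrative helpers under standard definitions of these techniques, not the authors' implementation:

```python
import numpy as np

def dueling_q(value, advantages):
    # Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a' A(s,a').
    # Subtracting the mean advantage keeps the V and A streams identifiable.
    return value + advantages - advantages.mean()

def double_q_target(reward, gamma, q_online_next, q_target_next, done):
    # Double Q-learning target: the online network selects the next action,
    # while the slowly-updated target network evaluates it, reducing the
    # overestimation bias of plain DQN.
    a_star = int(np.argmax(q_online_next))
    return reward if done else reward + gamma * q_target_next[a_star]
```

Decoupling action selection from action evaluation in this way is what stabilizes training relative to the standard DQN baseline compared in the study.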

Keywords [English]

  • Adaptive traffic light control
  • Left-turn phase
  • Reinforcement learning
  • Double dueling deep Q-network