Volume 25 Issue 4
Aug.  2025
ZHAO Hong-xing, WANG Yu-jie, NIE Jiang-long, LIANG Rui-yan, HE Rui-chun. Deep reinforcement learning signal continuous control of intersection based on cellular deduction multi-step decision mechanism[J]. Journal of Traffic and Transportation Engineering, 2025, 25(4): 296-310. doi: 10.19818/j.cnki.1671-1637.2025.04.021

Deep reinforcement learning signal continuous control of intersection based on cellular deduction multi-step decision mechanism

doi: 10.19818/j.cnki.1671-1637.2025.04.021
Funds:
  • National Natural Science Foundation of China (52162041)
  • Natural Science Foundation of Gansu Province (24JRRA255)
  • Lanzhou Youth Science and Technology Talent Innovation Program (2023-QN-125)

  • Corresponding author: HE Rui-chun (1969-), female, professor, PhD, tranman@163.com
  • Received Date: 2024-10-17
  • Revision Received Date: 2025-05-21
  • Accepted Date: 2025-06-06
  • Publish Date: 2025-08-28
Abstract: In most current adaptive signal control models based on deep reinforcement learning, the agent can only control traffic signals discretely, depending on the current state. To address this limitation, a deep reinforcement learning-based continuous signal control model for intersections was established by introducing a multi-step decision mechanism. The operation and transformation of traffic flow were simulated with a cellular deduction method to realize the state transition of the intersection. After feature extraction, the state obtained by cellular deduction was concatenated with the current release phase, the vehicle arrival rate, and the departure rate from the previous decision cycle to form the state input of the model, which improved the accuracy of agent decision-making. The multi-step decision mechanism was used to complete the pre-decision of four phases, and the four decisions were then integrated and transmitted to the signal light to realize continuous adaptive signal control. To verify the applicability of the model, a simulation analysis was conducted on the SUMO platform. Using measured intersection traffic flow data, the model was compared with five other models under different scenarios. The results show that, under four different traffic scenarios, the optimization effect of the proposed model is equivalent to that of the deep reinforcement learning-based signal control model relying on a discrete grid state. Compared with the deep reinforcement learning-based signal control model relying on a feature vector state space, the proposed model reduces average waiting time and fuel consumption by at least 9.80% and 4.56%, respectively. Compared with the traditional Webster timing model, it reduces average waiting time and fuel consumption by at least 9.30% and 4.67%, respectively. These results show that the proposed model achieves continuous traffic signal control with good stability and adaptability, which is of positive significance for promoting the practical application of deep reinforcement learning-based signal control.
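The idea of deducing the intersection state forward with cells and pre-deciding four phases before sending one integrated plan to the signal can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the cellular deduction resembles Daganzo's cell transmission update, it treats each phase as serving a single approach lane, and it replaces the trained agent with a hypothetical rule-based `queue_policy`. The names `ctm_step`, `plan_cycle`, `queue_policy` and all parameter values are illustrative assumptions.

```python
from typing import Callable, List, Sequence, Tuple


def ctm_step(n: Sequence[float], inflow: float, green: bool,
             q_max: float = 1.0, n_max: float = 4.0,
             w_over_v: float = 1.0) -> Tuple[List[float], float]:
    """Advance one approach lane by one time step (cell transmission update).

    n[0] is the upstream cell and n[-1] the cell at the stop line.
    Demand that cell 0 cannot admit is simply dropped in this sketch.
    Returns the new cell occupancies and the flow that left the stop line.
    """
    k = len(n)
    y = [0.0] * (k + 1)  # y[i]: flow into cell i; y[k]: flow out of the lane
    y[0] = min(inflow, q_max, w_over_v * (n_max - n[0]))
    for i in range(1, k):
        # Sending limit, capacity limit, and receiving (free space) limit.
        y[i] = min(n[i - 1], q_max, w_over_v * (n_max - n[i]))
    y[k] = min(n[k - 1], q_max) if green else 0.0  # red blocks the stop line
    new_n = [n[i] + y[i] - y[i + 1] for i in range(k)]
    return new_n, y[k]


def queue_policy(lanes: Sequence[Sequence[float]], phase: int,
                 g_min: int = 2, g_max: int = 10) -> int:
    """Hypothetical stand-in for the agent: green steps follow the queue size."""
    return max(g_min, min(g_max, round(sum(lanes[phase]))))


def plan_cycle(lanes: Sequence[Sequence[float]], arrivals: Sequence[float],
               choose_green: Callable[[Sequence[Sequence[float]], int], int],
               n_phases: int = 4) -> Tuple[List[int], List[List[float]]]:
    """Pre-decide the green time of every phase before acting.

    For each phase in turn, a green time is chosen from the deduced state,
    then all lanes are deduced forward through that green, so the next
    decision sees the predicted state rather than the current one.
    """
    state = [list(lane) for lane in lanes]
    plan: List[int] = []
    for phase in range(n_phases):
        g = choose_green(state, phase)
        for _ in range(g):
            state = [ctm_step(lane, arrivals[j], green=(j == phase))[0]
                     for j, lane in enumerate(state)]
        plan.append(g)
    return plan, state
```

Calling `plan_cycle(lanes, arrivals, queue_policy)` returns the four green times as one plan together with the predicted occupancies at the end of the cycle. In the paper's setting the decision function would be the trained agent, and its input would also include the release phase and the arrival and departure rates described above.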

     

