Volume 24 Issue 6
Dec. 2024
XIE Xian-yi, ZHAO Xin, JIN Li-sheng, GUO Bai-cang, LI Ke-qiang. Trajectory tracking control of intelligent vehicles based on deep reinforcement learning and rolling horizon optimization[J]. Journal of Traffic and Transportation Engineering, 2024, 24(6): 259-272. doi: 10.19818/j.cnki.1671-1637.2024.06.018

Trajectory tracking control of intelligent vehicles based on deep reinforcement learning and rolling horizon optimization

doi: 10.19818/j.cnki.1671-1637.2024.06.018
Funds:

  • National Natural Science Foundation of China 52072333
  • National Natural Science Foundation of China 52202503
  • Open Fund Project of State Key Laboratory of Automotive Safety and Energy Conservation KFY2211
  • Natural Science Foundation of Hebei Province F2021203107
  • Natural Science Foundation of Hebei Province F2022203054
  • National Key Research and Development Program of China 2022YFF0604901

  • Author Bio:

    XIE Xian-yi(1989-), male, associate professor, PhD, xiexianyi@ysu.edu.cn

  • Corresponding author: JIN Li-sheng(1975-), male, professor, PhD, jinls@ysu.edu.cn
  • Received Date: 2024-08-29
  • Publish Date: 2024-12-25
  • Abstract: To improve the generalization of trajectory tracking strategies for intelligent vehicles trained by deep reinforcement learning, a trajectory tracking control method based on rolling optimization and the twin delayed deep deterministic policy gradient (ROTD3) was proposed to address the poor trajectory tracking performance under different speed conditions when reinforcement learning models were trained at a single speed. The trajectory tracking model was trained with the twin delayed deep deterministic policy gradient (TD3) deep reinforcement learning algorithm to track a double lane change trajectory at a fixed speed. The parameters of the TD3 model were adjusted to obtain a strategy that satisfied the required trajectory tracking accuracy and converged rapidly. Based on the trained TD3 model and the idea of model predictive control (MPC), an integrated ROTD3 framework was constructed. In the prediction horizon, the front-wheel steering angle output by the TD3 model was used for prediction. The lateral deviation and heading deviation in the course of trajectory tracking were optimized in the rolling horizon, and the control horizon sequence, in the form of front-wheel steering angle increments, was solved by quadratic programming. The first control increment in the control horizon was added to the TD3 control quantity, and this sum was used as the front-wheel steering angle control output. ROTD3, TD3, and MPC were compared through trajectory tracking simulation experiments. Research results show that ROTD3 achieves higher trajectory tracking accuracy. During double lane change trajectory tracking at a longitudinal speed of 20 m·s⁻¹, the mean absolute lateral deviation of ROTD3 is 83.52% lower than that of TD3 and 91.02% lower than that of MPC. When tracking a snake-like trajectory, the results of ROTD3 are consistent with those of the double lane change simulation. When the front-wheel steering angle output by the TD3 model results in large tracking deviations, the front-wheel steering angle increment obtained through the rolling horizon optimization effectively compensates for these deviations. The ROTD3 framework significantly improves vehicle trajectory tracking performance under various conditions and effectively enhances the generalization and applicability of the TD3 reinforcement learning trajectory tracking strategy.
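
    The core of the framework, predicting deviations over a horizon under the TD3 steering angle and solving a quadratic program for corrective steering increments, can be illustrated with a short sketch. The Python code below is a minimal illustration under stated assumptions, not the authors' implementation: the error dynamics are a placeholder linearized two-state model (the matrices A and B and all numerical values are invented for illustration), the rolling-horizon problem is solved as an unconstrained least-squares QP rather than the constrained quadratic program of the paper, and the TD3 actor's steering output u_td3 is taken as given.

    ```python
    import numpy as np

    def rotd3_step(x0, u_td3, A, B, Np=10, Nc=3, lam=0.05):
        """One ROTD3-style control step: refine the TD3 front-wheel
        steering angle with a rolling-horizon correction that drives
        the error state (lateral and heading deviations) toward zero."""
        n = A.shape[0]

        # Powers of A up to Np, computed once.
        Apow = [np.eye(n)]
        for _ in range(Np):
            Apow.append(A @ Apow[-1])

        # Free response: predicted errors if the TD3 angle is held constant.
        F = np.zeros((Np, n))
        for k in range(1, Np + 1):
            forced = sum(Apow[i] @ B for i in range(k))
            F[k - 1] = Apow[k] @ x0 + forced * u_td3

        # Sensitivity of the stacked prediction to the Nc steering
        # increments: increment d_j perturbs the input at step j only.
        Theta = np.zeros((Np * n, Nc))
        for k in range(1, Np + 1):
            for j in range(min(k, Nc)):
                Theta[(k - 1) * n:k * n, j] = Apow[k - 1 - j] @ B

        # Unconstrained quadratic program (constraints omitted for brevity):
        #   min_d || F + Theta d ||^2 + lam * ||d||^2
        H = Theta.T @ Theta + lam * np.eye(Nc)
        d = np.linalg.solve(H, -Theta.T @ F.ravel())

        # Receding-horizon rule: apply only the first increment,
        # added on top of the TD3 output.
        return u_td3 + d[0]

    # Hypothetical discretized error model (illustrative numbers only):
    # x = [lateral deviation, heading deviation], u = front-wheel angle.
    A = np.array([[1.0, 0.2],
                  [0.0, 1.0]])
    B = np.array([0.02, 0.10])
    u = rotd3_step(x0=np.array([0.5, 0.1]), u_td3=0.03, A=A, B=B)
    ```

    As in MPC, only the first increment of the solved sequence is applied; the prediction and optimization are then repeated at the next sampling instant with the updated state and a fresh TD3 output.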

