XIE Xian-yi, ZHAO Xin, JIN Li-sheng, GUO Bai-cang, LI Ke-qiang. Trajectory tracking control of intelligent vehicles based on deep reinforcement learning and rolling horizon optimization[J]. Journal of Traffic and Transportation Engineering, 2024, 24(6): 259-272. doi: 10.19818/j.cnki.1671-1637.2024.06.018
Trajectory tracking control of intelligent vehicles based on deep reinforcement learning and rolling horizon optimization

    XIE Xian-yi(1989-), male, associate professor, PhD, xiexianyi@ysu.edu.cn

  • Corresponding author: JIN Li-sheng(1975-), male, professor, PhD, jinls@ysu.edu.cn
  • Received Date: 2024-08-29
  • Publish Date: 2024-12-25
To improve the generalization of trajectory tracking strategies trained by deep reinforcement learning for intelligent vehicles, a trajectory tracking control method based on rolling optimization and twin delayed deep deterministic policy gradient (ROTD3) was proposed. The method addresses the poor tracking performance observed at other speeds when a reinforcement learning model is trained at a single speed. The trajectory tracking model was trained with the twin delayed deep deterministic policy gradient (TD3) deep reinforcement learning algorithm to track a double lane change trajectory at a fixed speed. The parameters of the TD3 model were adjusted to obtain a strategy that satisfied the required trajectory tracking accuracy and converged rapidly. Based on the trained TD3 model and the idea of model predictive control (MPC), the integrated ROTD3 framework was constructed. In the prediction horizon, the front-wheel steering angle output by the TD3 model was used for prediction. The lateral deviation and heading deviation during trajectory tracking were optimized over the rolling horizon, and the control horizon sequence, expressed as increments of the front-wheel steering angle, was solved by quadratic programming. The first control increment in the control horizon was added to the TD3 control quantity, and the sum was used as the front-wheel steering angle control output. ROTD3, TD3, and MPC were compared in trajectory tracking simulation experiments. Research results show that ROTD3 achieves higher trajectory tracking accuracy. During double lane change trajectory tracking at a longitudinal speed of 20 m·s⁻¹, the mean absolute lateral deviation of ROTD3 is 83.52% lower than that of TD3 and 91.02% lower than that of MPC. When tracking a snake-like trajectory, the results of ROTD3 are consistent with those of the double lane change simulation.
When the front-wheel steering angle output by the TD3 model results in large tracking deviations, the front-wheel steering angle increment obtained through the rolling horizon optimization effectively compensates for these deviations. The ROTD3 framework significantly improves vehicle trajectory tracking performance across the tested speed and trajectory conditions and effectively enhances the generalization and applicability of the TD3 reinforcement learning trajectory tracking strategy.
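
The correction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-state kinematic error model, the sign convention, the function name `rotd3_steering`, and all parameter values are assumptions made for illustration. The TD3 steering angle is assumed to be held constant over the prediction horizon, and the quadratic program is taken without constraints, so it reduces to a linear solve.

```python
import numpy as np


def rotd3_steering(e0, delta_td3, v, dt=0.05, wheelbase=2.7,
                   n_pred=10, n_ctrl=3, q=(1.0, 1.0), r=1.0):
    """Add a rolling-horizon steering increment to a TD3 steering command.

    e0        : (lateral deviation [m], heading deviation [rad])
    delta_td3 : front-wheel steering angle proposed by the TD3 policy [rad]
    v         : longitudinal speed [m/s]
    All model and weight choices here are illustrative assumptions.
    """
    # Assumed linear kinematic error model: e[k+1] = A e[k] + B * steering
    A = np.array([[1.0, v * dt],
                  [0.0, 1.0]])
    B = np.array([0.0, v * dt / wheelbase])
    n = 2

    # Free response: predicted errors when only the TD3 angle is applied.
    free = np.zeros(n_pred * n)
    e = np.asarray(e0, dtype=float)
    for k in range(n_pred):
        e = A @ e + B * delta_td3
        free[k * n:(k + 1) * n] = e

    # Theta[:, j]: effect of a unit steering increment applied at step j
    # (and held afterwards) on the stacked predicted errors.
    theta = np.zeros((n_pred * n, n_ctrl))
    for j in range(n_ctrl):
        resp = np.zeros(n)
        for k in range(j, n_pred):
            resp = A @ resp + B
            theta[k * n:(k + 1) * n, j] = resp

    # Unconstrained QP: min (free + Theta du)' Q (free + Theta du) + r du'du
    q_bar = np.diag(np.tile(np.asarray(q, dtype=float), n_pred))
    hessian = theta.T @ q_bar @ theta + r * np.eye(n_ctrl)
    gradient = theta.T @ q_bar @ free
    du = -np.linalg.solve(hessian, gradient)

    # Only the first increment is applied, as in receding-horizon control.
    return delta_td3 + du[0], du
```

A call such as `rotd3_steering((0.5, 0.0), delta_td3=0.0, v=20.0)` returns the corrected steering angle together with the full increment sequence. A practical controller would typically add steering angle and rate limits, which turns the linear solve into a constrained quadratic program.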


