Human-machine integration method for steering decision-making of intelligent vehicle based on reinforcement learning

WU Chao-zhong, LENG Yao, CHEN Zhi-jun, LUO Peng

Citation: WU Chao-zhong, LENG Yao, CHEN Zhi-jun, LUO Peng. Human-machine integration method for steering decision-making of intelligent vehicle based on reinforcement learning[J]. Journal of Traffic and Transportation Engineering, 2022, 22(3): 55-67. (in Chinese) doi: 10.19818/j.cnki.1671-1637.2022.03.004

doi: 10.19818/j.cnki.1671-1637.2022.03.004
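Funds:

National Natural Science Foundation of China 52172394

National Key Research and Development Program of China 2018YFB1600600

Major Science and Technology Project in Hubei Province 2020AAA001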

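More information
    Author biography:

    WU Chao-zhong (1972-), male, born in Tianmen, Hubei; professor at Wuhan University of Technology; Doctor of Engineering; research interests: traffic safety and human-machine integrated driving

    Corresponding author:

    CHEN Zhi-jun (1983-), male, born in Zhoukou, Henan; associate researcher at Wuhan University of Technology; Doctor of Engineering

  • Chinese Library Classification (CLC) number: U461.9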

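  • Abstract: To address the continuous, dynamic allocation of driving authority between the human driver and the autonomous driving system in a human-machine integrated (HMI) driving system for intelligent vehicles, and in particular the low adaptability of weight allocation methods caused by modeling errors, an HMI steering decision-making method based on reinforcement learning was proposed. Considering the driver's steering characteristics, a driver model based on two-point preview was built, an autonomous steering control model of the intelligent vehicle was established using predictive control theory, and a steering control framework with the human and the machine simultaneously in the loop was constructed. Based on the Actor-Critic reinforcement learning architecture, a deep deterministic policy gradient (DDPG) agent was designed to allocate driving authority between the human and the machine, and a model-based gain function was proposed with curvature fit, tracking accuracy, and ride comfort as objectives. A reinforcement learning framework for HMI driving authority allocation was constructed, comprising the driver model, the autonomous steering model, the driving authority allocation agent, and the gain function. To verify the effectiveness of the method, eight drivers were recruited for simulated driving tests totaling 48 runs. The results show that, in the curvature adaptability verification, the HMI-DDPG method outperforms manual driving and the HMI-Fuzzy method: tracking performance improves on average by 70.69% and 39.67%, respectively, and comfort improves on average by 18.34% and 7.55%, respectively. In the speed adaptability verification, at vehicle speeds of 40, 60, and 80 km·h⁻¹, the proportions of time in which the driver's weight is greater than 0.5 are 90.00%, 85.76%, and 60.74%, respectively, and both the tracking and the comfort phase trajectories converge effectively. The proposed method can therefore adapt to changes in curvature and vehicle speed, and it improves tracking performance and comfort while ensuring safety.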
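The abstract speaks of a driver weight and of the share of time in which that weight exceeds 0.5, but it does not state how the two steering commands are combined. The following is a minimal sketch that assumes a convex (linear) blend of the driver's and the automation's steering commands; the function names `blend_steering` and `driver_dominant_share` are hypothetical and are not taken from the paper.

```python
from typing import Iterable


def blend_steering(delta_driver: float, delta_auto: float, driver_weight: float) -> float:
    """Blend two steering commands (same unit) using a driver weight in [0, 1].

    ASSUMPTION: a convex combination. The abstract does not give the blending law.
    """
    if not 0.0 <= driver_weight <= 1.0:
        raise ValueError("driver_weight must lie in [0, 1]")
    return driver_weight * delta_driver + (1.0 - driver_weight) * delta_auto


def driver_dominant_share(weights: Iterable[float]) -> float:
    """Percentage of samples in which the driver's weight is greater than 0.5."""
    samples = list(weights)
    if not samples:
        raise ValueError("weights must not be empty")
    return 100.0 * sum(w > 0.5 for w in samples) / len(samples)


if __name__ == "__main__":
    # Driver asks for 10 degrees, automation for 2 degrees, driver weight 0.75.
    print(blend_steering(10.0, 2.0, 0.75))              # 8.0
    # Two of the four samples are strictly above 0.5.
    print(driver_dominant_share([0.9, 0.6, 0.4, 0.5]))  # 50.0
```

Under this assumption, a weight above 0.5 means the driver's command dominates the blended steering angle, which is the quantity the speed adaptability results report as a time share.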

     

  • Figure 1. Driver steering control principle

    Figure 2. Composition of driver model

    Figure 3. Tracking error model

    Figure 4. Framework of HMI steering

    Figure 5. Reinforcement learning framework of HMI driving

    Figure 6. Results of DDPG agent reinforcement learning

    Figure 7. HMI driving experimental platform

    Figure 8. Working condition 1

    Figure 9. Working condition 2

    Figure 10. Box plots in working condition 1

    Figure 11. Detailed data of driver 1 in working condition 1

    Figure 12. Comparison of index reduction rates in working condition 1

    Figure 13. Box plots in working condition 2

    Figure 14. Detailed data of driver 1 in working condition 2

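    Table 1. Gain function parameters

    | Parameter | Value | Basis for value |
    |---|---|---|
    | τ1 | 1/3 | Reciprocal of the mean steering angle (°) |
    | τ2 | 1 | Reciprocal of the mean lateral acceleration (m·s⁻²) |
    | τ3 | 10 | Reciprocal of the mean sideslip angle at the center of mass (°) |
    | τ4 | 5 | Reciprocal of the mean position error (m) |
    | τ5 | 2 | Reciprocal of the mean heading angle error (°) |
    | σ1, σ2, σ3 | -1, -1, -1 | Average weights |
    | ρ1, ρ2, ρ3 | 1, 1, 10 | |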

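    Table 2. DDPG algorithm parameters

    | Parameter | Value |
    |---|---|
    | Sampling step/s | 0.1 |
    | Duration of a single training run/s | 60 |
    | Critic learning rate | 5.0×10⁻⁴ |
    | Actor learning rate | 1.0×10⁻³ |
    | Smoothing factor | 1.0×10⁻³ |
    | Number of experience samples | 64 |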
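To make the role of the Table 2 values concrete, here is a minimal sketch in plain Python. It assumes that the "smoothing factor" is the coefficient of the usual DDPG soft update of the target networks and that the "number of experience samples" is the mini-batch size drawn from the replay buffer; the page does not confirm either reading. The names `DDPG_PARAMS`, `steps_per_run`, and `soft_update` are hypothetical.

```python
# Hyperparameters copied from Table 2; the key names are illustrative only.
DDPG_PARAMS = {
    "sample_time_s": 0.1,       # sampling step
    "run_duration_s": 60.0,     # duration of a single training run
    "critic_lr": 5.0e-4,
    "actor_lr": 1.0e-3,
    "smoothing_factor": 1.0e-3,
    "batch_size": 64,           # ASSUMPTION: mini-batch size from the replay buffer
}


def steps_per_run(params: dict) -> int:
    """Number of agent decisions in one training run."""
    return round(params["run_duration_s"] / params["sample_time_s"])


def soft_update(target: list, online: list, tau: float) -> list:
    """Soft target update: target <- (1 - tau) * target + tau * online.

    ASSUMPTION: this is the standard DDPG target-network update; parameters are
    represented here as flat lists of floats for simplicity.
    """
    if len(target) != len(online):
        raise ValueError("parameter lists must have the same length")
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


if __name__ == "__main__":
    print(steps_per_run(DDPG_PARAMS))
    print(soft_update([0.0, 1.0], [1.0, 1.0], DDPG_PARAMS["smoothing_factor"]))
```

With a smoothing factor of 1.0×10⁻³, each update moves the target parameters only 0.1% of the way toward the online parameters, which keeps the critic's learning targets slowly varying.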

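    Table 3. Advantages of proposed method in working condition 1 (%)

    | Parameter | Compared method | Driver 1 | Driver 2 | Driver 3 | Driver 4 | Driver 5 | Driver 6 | Driver 7 | Driver 8 | Mean |
    |---|---|---|---|---|---|---|---|---|---|---|
    | e1max | Manual driving | 67.89 | 80.95 | 77.73 | 85.34 | 81.21 | 74.01 | 73.56 | 77.93 | 77.33 |
    | e1max | HMI-Fuzzy | 36.27 | 32.02 | 34.63 | 38.39 | 39.11 | 47.77 | 8.73 | 22.01 | 32.37 |
    | e2max | Manual driving | 60.40 | 77.40 | 70.80 | 75.06 | 63.11 | 27.61 | 63.18 | 74.86 | 64.05 |
    | e2max | HMI-Fuzzy | 29.57 | 62.77 | 56.40 | 61.60 | 51.49 | 41.37 | 51.12 | 21.41 | 46.97 |
    | amax | Manual driving | 16.33 | 31.52 | 29.69 | 7.44 | 19.46 | 22.29 | 15.71 | 27.59 | 21.25 |
    | amax | HMI-Fuzzy | 8.18 | 15.72 | 5.02 | 16.39 | 15.83 | 14.53 | -0.55 | 5.11 | 10.03 |
    | βmax | Manual driving | 12.47 | 19.22 | 21.75 | 8.91 | 16.00 | 14.36 | 13.21 | 17.49 | 15.43 |
    | βmax | HMI-Fuzzy | 5.31 | 3.91 | 1.71 | 8.95 | 8.68 | 7.07 | 1.97 | 2.99 | 5.07 |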
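The averages quoted in the abstract can be traced back to Table 3. The sketch below assumes that "tracking" is the mean of the e1max and e2max rows and that "comfort" is the mean of the amax and βmax rows. This grouping is an inference that reproduces the abstract's figures and is not stated on this page; the names `TABLE3` and `group_mean` are hypothetical.

```python
# Per-driver improvement of the proposed method (%), copied from Table 3.
TABLE3 = {
    "e1max": {
        "manual": [67.89, 80.95, 77.73, 85.34, 81.21, 74.01, 73.56, 77.93],
        "fuzzy": [36.27, 32.02, 34.63, 38.39, 39.11, 47.77, 8.73, 22.01],
    },
    "e2max": {
        "manual": [60.40, 77.40, 70.80, 75.06, 63.11, 27.61, 63.18, 74.86],
        "fuzzy": [29.57, 62.77, 56.40, 61.60, 51.49, 41.37, 51.12, 21.41],
    },
    "amax": {
        "manual": [16.33, 31.52, 29.69, 7.44, 19.46, 22.29, 15.71, 27.59],
        "fuzzy": [8.18, 15.72, 5.02, 16.39, 15.83, 14.53, -0.55, 5.11],
    },
    "betamax": {
        "manual": [12.47, 19.22, 21.75, 8.91, 16.00, 14.36, 13.21, 17.49],
        "fuzzy": [5.31, 3.91, 1.71, 8.95, 8.68, 7.07, 1.97, 2.99],
    },
}


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def group_mean(metrics, baseline):
    """Mean over several metrics of the per-metric mean across the eight drivers."""
    return mean([mean(TABLE3[m][baseline]) for m in metrics])


if __name__ == "__main__":
    # ASSUMED grouping: tracking = (e1max, e2max), comfort = (amax, betamax).
    for label, metrics in (("tracking", ("e1max", "e2max")),
                           ("comfort", ("amax", "betamax"))):
        for baseline in ("manual", "fuzzy"):
            print(f"{label} vs {baseline}: {group_mean(metrics, baseline):.2f}%")
```

Rounded to two decimals, the four printed values agree with the 70.69%, 39.67%, 18.34%, and 7.55% reported in the abstract.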

    Table 4. Indicator comparison of driver 1 in working condition 1

    | Compared method | e1max/m | e2max/(°) | βmax/(°) | amax/(m·s⁻²) | Δδ0/[(°)·s⁻¹] | Δa0/(m·s⁻³) |
    |---|---|---|---|---|---|---|
    | Manual driving | 0.875 | 2.88 | 0.339 | 3.53 | 5.42 | 0.482 |
    | HMI-Fuzzy | 0.441 | 1.62 | 0.313 | 3.22 | 4.76 | 0.356 |
    | HMI-DDPG | 0.281 | 1.14 | 0.296 | 2.95 | 3.31 | 0.272 |
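The driver 1 column of Table 3 appears to follow from Table 4 as a relative reduction of each indicator. The sketch below assumes the definition (baseline - proposed) / baseline × 100, which the page does not state; the names `TABLE4` and `reduction_rate` are hypothetical. Because Table 4 shows rounded values, the recomputed rates are close to, but not always identical with, the Table 3 entries.

```python
# Indicator values for driver 1 in working condition 1, copied from Table 4.
TABLE4 = {
    "manual": {"e1max": 0.875, "e2max": 2.88, "betamax": 0.339, "amax": 3.53},
    "fuzzy": {"e1max": 0.441, "e2max": 1.62, "betamax": 0.313, "amax": 3.22},
    "ddpg": {"e1max": 0.281, "e2max": 1.14, "betamax": 0.296, "amax": 2.95},
}


def reduction_rate(baseline: float, proposed: float) -> float:
    """Relative reduction of an indicator in percent.

    ASSUMPTION: rate = (baseline - proposed) / baseline * 100.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - proposed) / baseline


if __name__ == "__main__":
    for baseline in ("manual", "fuzzy"):
        for name, value in TABLE4[baseline].items():
            rate = reduction_rate(value, TABLE4["ddpg"][name])
            print(f"{name} vs {baseline}: {rate:.2f}%")
```

For example, the e1max reduction relative to manual driving evaluates to roughly 67.89%, the driver 1 entry of the first row of Table 3.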

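    Table 5. Time ratios in working condition 2 when driver's weight is greater than 0.5

    | Speed/(km·h⁻¹) | Driver 1/% | Driver 2/% | Driver 3/% | Driver 4/% | Driver 5/% | Driver 6/% | Driver 7/% | Driver 8/% | Mean/% |
    |---|---|---|---|---|---|---|---|---|---|
    | 40 | 91.45 | 89.76 | 87.00 | 87.76 | 91.67 | 92.49 | 89.98 | 89.89 | 90.00 |
    | 60 | 90.34 | 66.54 | 83.41 | 89.97 | 91.74 | 85.37 | 83.94 | 94.80 | 85.76 |
    | 80 | 71.92 | 53.98 | 55.09 | 67.53 | 65.53 | 51.00 | 47.62 | 73.21 | 60.74 |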

Publication history
  • Received: 2021-12-23
  • Published: 2022-06-25