Real-time 3D conflict resolution method for low-altitude heterogeneous aircraft based on multi-agent proximal policy optimization

CHEN Yun-xiang; GOU Ming; ZHANG Jian-ping; LU Wei-ning; TANG Kai; ZHANG Guang-yuan

doi:10.19818/j.cnki.1671-1637.2026.092

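Details

About the author:
CHEN Yun-xiang (1992-), male, from Yibin, Sichuan, assistant research fellow, PhD, postdoctoral researcher, E-mail: chenyunxiang@swjtu.edu.cn

Corresponding author:
ZHANG Jian-ping (1976-), male, from Wuhu, Anhui, research fellow, Doctor of Engineering, E-mail: zhangjp@swjtu.edu.cn

CLC number: U8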
Publication history
- Received: 2025-08-31
- Revised: 2025-10-14
- Accepted: 2025-11-27
- Published: 2026-03-28

Real-time 3D conflict resolution method for low-altitude heterogeneous aircraft based on multi-agent proximal policy optimization

CHEN Yun-xiang^{1, 2},
GOU Ming³,
ZHANG Jian-ping^{1, 2},
LU Wei-ning⁴,
TANG Kai³,
ZHANG Guang-yuan^{1, 2}

1. School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 610031, Sichuan, China
2. Intelligent Management and Control of Low-altitude Traffic Key Laboratory of Sichuan Province, Chengdu 610031, Sichuan, China
3. College of Mechanical and Electrical Engineering, Beijing Information Science and Technology University, Beijing 100192, China
4. Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China

Funds:

National Key R&D Program 2022YFB4300903

Civil Aviation Joint Research Fund of National Natural Science Foundation of China U2433217

National Natural Science Foundation of China 52472332

Sichuan Provincial Major Science and Technology Special Project - Tackling Key Problems Initiative 2024ZDZX0044

Natural Science Foundation of Sichuan Province 2025ZNSFSC0394

More Information

Corresponding author: ZHANG Jian-ping, research fellow, PhD, Email: zhangjp@swjtu.edu.cn

Abstract: To address real-time three-dimensional conflict resolution for low-altitude heterogeneous aircraft, this study examines a rapidly growing operational scenario in which medium-to-large fixed-wing aircraft and light small multi-rotor unmanned aerial vehicles (UAVs) share airspace. A method based on multi-agent proximal policy optimization (MAPPO) is proposed within a centralized training and decentralized execution framework. Based on the operational characteristics of the two aircraft types, a real-time three-dimensional conflict resolution strategy is established in which fixed-wing aircraft maintain stable flight while multi-rotor UAVs perform avoidance maneuvers. A multi-dimensional reward function is constructed that balances collision avoidance, mission efficiency, priority, and smoothness. A priority mechanism is introduced to protect the mission priority of fixed-wing aircraft and to guide multi-rotor UAVs toward proactive avoidance. Simulation results show that baseline tests with 5, 10, 20, and 30 light small multi-rotor UAVs all achieve a mission success rate of at least 92%, with a computational overhead of 0.16 to 0.36 min, an average conflict resolution time of 0.28 to 1.76 s, and a flight conflict proportion of 0.95% to 2.18%. By optimizing the state space, action space, and reward function, the proposed method shortens the conflict resolution time by 2.25 s and raises the mission success rate by 2 percentage points relative to existing methods. These results lay a foundation for further research on the integrated operation of low-altitude heterogeneous aircraft over wide areas.
Keywords: low-altitude traffic; conflict resolution; multi-agent proximal policy optimization; low-altitude heterogeneous aircraft; low-altitude navigation system
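The abstract names four reward components (collision avoidance, mission efficiency, priority, and smoothness) but this page does not give their formulas. The following is a minimal sketch of how such a weighted multi-term reward could be composed for one multi-rotor UAV agent. Every functional form, weight, scaling constant, and name in it (`RewardWeights`, `step_reward`) is an assumption made for illustration and should not be read as the authors' reward function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # All weights are hypothetical placeholders, not values from the paper.
    collision: float = 10.0
    efficiency: float = 1.0
    priority: float = 2.0
    smoothness: float = 1.0


def step_reward(
    dist_to_nearest: float,       # m, distance to the closest other aircraft
    safety_separation: float,     # m, required separation for this aircraft
    progress: float,              # m, reduction in distance-to-goal this step
    nearest_is_fixed_wing: bool,  # True if the closest aircraft has priority
    heading_change_deg: float,    # deg, heading adjustment commanded this step
    w: RewardWeights = RewardWeights(),
) -> float:
    """Illustrative per-step reward for one multi-rotor UAV agent."""
    # Collision avoidance: penalize how deeply the safety separation is violated
    # (0 when outside the separation, 1 at zero distance).
    intrusion = max(0.0, 1.0 - dist_to_nearest / safety_separation)
    r_collision = -w.collision * intrusion

    # Mission efficiency: reward progress toward the goal (assumed 100 m scale).
    r_efficiency = w.efficiency * progress / 100.0

    # Priority: extra penalty when the violated separation involves a fixed-wing
    # aircraft, so the UAV learns to give way.
    r_priority = -w.priority * intrusion if nearest_is_fixed_wing else 0.0

    # Smoothness: penalize large heading changes (assumed 60 degree scale).
    r_smooth = -w.smoothness * abs(heading_change_deg) / 60.0

    return r_collision + r_efficiency + r_priority + r_smooth
```

Under these assumed terms, the same separation violation costs more when the other aircraft is a fixed-wing one, which is one simple way to encode the avoidance responsibility that the abstract assigns to the multi-rotor UAVs.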

Figure 1. Framework of the MAPPO algorithm

Figure 2. Network structure of the MAPPO algorithm

Figure 3. Local observation matrix and aircraft classification

Figure 4. Simulation vector diagrams of flight conflict resolution for light small multi-rotor UAVs (experiment 1)

Figure 5. Simulation vector diagrams of flight conflict resolution for heterogeneous aircraft under comprehensive reward (experiment 2)

Figure 6. Simulation vector diagrams of flight conflict resolution for heterogeneous aircraft in degraded airspace connectivity (experiment 5)

Table 1. Algorithm parameters

Parameter	Value
Discount factor	0.99
Learning rate	0.00004
Batch size	64
Clipping parameter	0.2
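The discount factor and clipping parameter in Table 1 correspond to the quantities usually written as γ and ε in proximal policy optimization. As a reminder of where they enter, here is a minimal pure-Python sketch of the standard discounted return and the standard PPO clipped surrogate objective. It shows the generic textbook form only; the function names are illustrative, and nothing here is taken from the authors' implementation.

```python
import math

GAMMA = 0.99     # discount factor (Table 1)
CLIP_EPS = 0.2   # clipping parameter (Table 1)


def discounted_returns(rewards, gamma=GAMMA):
    """Return G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def clipped_surrogate(new_logp, old_logp, advantages, eps=CLIP_EPS):
    """Mean of min(rho * A, clip(rho, 1 - eps, 1 + eps) * A) over samples."""
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)            # probability ratio rho
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * adv, clipped * adv)     # pessimistic bound
    return total / len(advantages)
```

With ε = 0.2, a sample whose probability ratio has grown to 1.5 under a positive advantage contributes as if the ratio were 1.2, which limits how far a single update can move the policy.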

Table 2. Parameters of light small multi-rotor UAV

Parameter	Value
Initial speed/(m·s^-1)	15
Safety separation/m	250
Maximum heading adjustment/(°)	60
Speed adjustment range/(m·s^-1)	[12.75, 17.25]
Smoothness weight	1

Table 3. Parameters of medium-to-large fixed-wing aircraft

Parameter	Value
Initial speed/(km·h^-1)	135
Safety separation/km	2
Maximum heading adjustment/(°)	30
Speed adjustment range/(km·h^-1)	[115, 155]
Smoothness weight	2
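The limits in Tables 2 and 3 can be read as bounds on each agent's action. The sketch below stores both parameter sets in a common SI form and clamps a commanded speed and heading change to them. The names `AircraftLimits` and `apply_action` are hypothetical, and treating the maximum heading adjustment as a symmetric per-step bound is an assumption, since this page does not state how the limit is applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AircraftLimits:
    initial_speed: float       # m/s
    safety_separation: float   # m
    max_heading_change: float  # degrees, assumed symmetric per adjustment
    speed_min: float           # m/s
    speed_max: float           # m/s
    smoothness_weight: float


KMH_TO_MS = 1000.0 / 3600.0

# Values from Table 2 (multi-rotor UAV), already in m/s and m.
MULTIROTOR = AircraftLimits(15.0, 250.0, 60.0, 12.75, 17.25, 1.0)

# Values from Table 3 (fixed-wing aircraft), converted from km/h and km.
FIXED_WING = AircraftLimits(
    135.0 * KMH_TO_MS, 2000.0, 30.0, 115.0 * KMH_TO_MS, 155.0 * KMH_TO_MS, 2.0
)


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def apply_action(speed, heading_deg, d_speed, d_heading_deg, limits):
    """Apply a speed and heading adjustment while respecting the table limits."""
    d_heading = clamp(
        d_heading_deg, -limits.max_heading_change, limits.max_heading_change
    )
    new_speed = clamp(speed + d_speed, limits.speed_min, limits.speed_max)
    new_heading = (heading_deg + d_heading) % 360.0  # wrap into [0, 360)
    return new_speed, new_heading
```

For example, a multi-rotor UAV at 15 m/s on heading 350° that is commanded +5 m/s and +90° ends up at 17.25 m/s on heading 50°, because both adjustments are cut back to the Table 2 limits.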

Table 4. Simulation results of conflict resolution for light small multi-rotor UAVs (experiment 1)

Number of aircraft	Flight conflict proportion/%	Average conflict resolution time/s	Mission success rate/%	Computational overhead/min
5	0.95	0.28	95	0.16
10	1.03	0.33	94	0.19
20	1.44	0.89	92	0.26
30	2.18	1.79	92	0.36

Table 5. Simulation results of conflict resolution for heterogeneous aircraft under comprehensive reward (experiment 2)

Number of aircraft	Flight conflict proportion/%	Average conflict resolution time/s	Mission success rate/%	Computational overhead/min
5	3.23	0.50	98	0.08
10	4.35	1.67	92	0.22
20	5.28	1.55	90	0.24
30	5.38	2.33	88	0.33

Table 6. Simulation results of conflict resolution for heterogeneous aircraft without priority reward (experiment 3)

Number of aircraft	Flight conflict proportion/%	Average conflict resolution time/s	Mission success rate/%	Computational overhead/min
5	4.13	1.05	98	0.13
10	4.37	2.28	91	0.16
20	5.93	2.41	90	0.28
30	6.28	3.05	86	0.31

Table 7. Simulation results of conflict resolution for heterogeneous aircraft without smoothness reward (experiment 4)

Number of aircraft	Flight conflict proportion/%	Average conflict resolution time/s	Mission success rate/%	Computational overhead/min
5	3.09	0.37	98	0.10
10	3.68	1.42	98	0.18
20	4.95	2.16	90	0.23
30	4.62	2.50	88	0.30

Table 8. Simulation results of conflict resolution for heterogeneous aircraft in degraded airspace connectivity (experiment 5)

Number of aircraft	Flight conflict proportion/%	Average conflict resolution time/s	Mission success rate/%	Computational overhead/min
5	3.45	1.30	95	0.28
10	5.15	1.87	93	0.31
20	5.18	3.00	88	0.43
30	5.41	4.36	86	0.53
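Experiments 2 to 5 share the same fleet sizes, so their results are easier to compare against the comprehensive-reward run as differences. The short sketch below copies the values from the tables for experiments 2 to 5 and computes, per fleet size, the change in average conflict resolution time relative to experiment 2. The dictionary keys and the helper name `resolution_time_delta` are illustrative choices, not identifiers from the paper.

```python
# Per experiment: aircraft count -> (conflict proportion %, average resolution
# time s, mission success rate %, computational overhead min).
EXPERIMENTS = {
    "exp2_comprehensive": {
        5: (3.23, 0.50, 98, 0.08), 10: (4.35, 1.67, 92, 0.22),
        20: (5.28, 1.55, 90, 0.24), 30: (5.38, 2.33, 88, 0.33),
    },
    "exp3_no_priority": {
        5: (4.13, 1.05, 98, 0.13), 10: (4.37, 2.28, 91, 0.16),
        20: (5.93, 2.41, 90, 0.28), 30: (6.28, 3.05, 86, 0.31),
    },
    "exp4_no_smoothness": {
        5: (3.09, 0.37, 98, 0.10), 10: (3.68, 1.42, 98, 0.18),
        20: (4.95, 2.16, 90, 0.23), 30: (4.62, 2.50, 88, 0.30),
    },
    "exp5_degraded_airspace": {
        5: (3.45, 1.30, 95, 0.28), 10: (5.15, 1.87, 93, 0.31),
        20: (5.18, 3.00, 88, 0.43), 30: (5.41, 4.36, 86, 0.53),
    },
}

TIME_INDEX = 1  # position of the average resolution time in each tuple


def resolution_time_delta(name, baseline="exp2_comprehensive"):
    """Average resolution time of `name` minus the baseline, per fleet size (s)."""
    base = EXPERIMENTS[baseline]
    other = EXPERIMENTS[name]
    return {
        n: round(other[n][TIME_INDEX] - base[n][TIME_INDEX], 2)
        for n in sorted(base)
    }


if __name__ == "__main__":
    for exp_name in EXPERIMENTS:
        print(exp_name, resolution_time_delta(exp_name))
```

On the tabulated data, removing the priority reward lengthens the average resolution time at every fleet size, by 0.55, 0.61, 0.86, and 0.72 s for 5, 10, 20, and 30 aircraft respectively.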

Table 9. Performance comparison of MAPPO and the improved DQN method

Method	Average conflict resolution time/s	Mission success rate/%
MAPPO	2.14	90
Improved DQN method	4.39	88
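The improvement figures quoted in the abstract can be checked directly against Table 9. The relative reduction is an added derived quantity, not a number reported on this page.

```latex
\Delta t = 4.39\,\mathrm{s} - 2.14\,\mathrm{s} = 2.25\,\mathrm{s}, \qquad
\frac{\Delta t}{4.39\,\mathrm{s}} \approx 51.3\%, \qquad
\Delta_{\mathrm{success}} = 90\% - 88\% = 2\ \text{percentage points}
```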
