A dynamic collision avoidance method for UAVs using value iteration

WEI Zhi-qiang; AN Xin

doi:10.19818/j.cnki.1671-1637.2026.153


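College of Air Traffic Management, Civil Aviation University of China, Tianjin 300300, China

Funding: Tianjin Science and Technology Program (23JCZDJC00580)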

Author biography: WEI Zhi-qiang (1979-), male, born in Mianchi, Henan, professor, E-mail: weizhiqia@sina.com

Corresponding author: WEI Zhi-qiang, professor, E-mail: weizhiqia@sina.com

CLC number: U8
Publication history
- Received: 2025-07-29
- Revised: 2025-12-11
- Accepted: 2026-01-23
- Published: 2026-03-28


Abstract: To meet the need for autonomous conflict resolution by unmanned aerial vehicles (UAVs), an optimization model based on a Markov decision process (MDP) solved by value iteration is proposed. First, a value-iteration-based dynamic collision avoidance model is constructed to achieve real-time safe collision avoidance. Second, to address the complexity and uncertainty of the airspace, a refined state space is formulated that covers the relative altitude between the two aircraft, the vertical speeds of the ownship and the intruder, the historical action, and time. Third, a multi-factor dynamic cost function is designed that weighs conflict risk, time to closest approach, and other factors when selecting actions, which reduces unnecessary maneuvers during collision avoidance. Finally, an adaptive two-layer probabilistic fusion mechanism is introduced to address the fragility of traditional deterministic decision-making in complex dynamic environments and to improve decision robustness. Simulation results show that the proposed method achieves safe collision avoidance in three conflict scenarios involving only dynamic intruders, with final relative altitudes between the two aircraft of 152.5, 188.0, and 143.7 m, respectively. In the mixed conflict scenario involving both static obstacles and dynamic intruders, the minimum vertical separation between the ownship and the static obstacles is 174.7 m and the relative altitude between the two aircraft is 230.7 m, which keeps the UAV safe. Compared with the dynamic window approach (DWA), across the four scenarios the value-iteration-based strategy reduces the average excessive altitude adjustment by 62.4% and the average number of unnecessary action switches by 88%. Solving the MDP for UAV collision avoidance with value-iteration-based dynamic programming is therefore feasible, and the UAV can achieve safe collision avoidance.
Keywords: aviation safety; dynamic collision avoidance; value iteration; dynamic programming; unmanned aerial vehicle; low-altitude traffic; Markov decision process
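For readers unfamiliar with the solver named in the abstract, the following is a minimal sketch of cost-minimizing value iteration on a two-state toy MDP. The states ("conflict", "safe"), the actions ("hold", "maneuver"), the transition probabilities, the costs, and the discount factor are all illustrative assumptions; they are not the model, parameters, or fusion mechanism of the paper.

```python
def q_value(P, C, V, s, a, gamma):
    """Immediate cost plus discounted expected cost-to-go."""
    return C[s][a] + gamma * sum(p * V[s2] for s2, p in enumerate(P[s][a]))


def value_iteration(P, C, gamma=0.95, tol=1e-10, max_iter=10_000):
    """Return (V, policy) minimizing expected discounted cost.

    P[s][a][s2] is the probability of moving from s to s2 under action a.
    C[s][a] is the immediate cost of taking action a in state s.
    """
    n_states, n_actions = len(C), len(C[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman backup: best (lowest) Q-value in every state.
        V_new = [
            min(q_value(P, C, V, s, a, gamma) for a in range(n_actions))
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the converged value function.
    policy = [
        min(range(n_actions), key=lambda a: q_value(P, C, V, s, a, gamma))
        for s in range(n_states)
    ]
    return V, policy


# Hypothetical toy problem: state 0 = "conflict", state 1 = "safe";
# action 0 = "hold", action 1 = "maneuver".
P = [
    [[1.0, 0.0], [0.1, 0.9]],  # from "conflict"
    [[0.0, 1.0], [0.0, 1.0]],  # from "safe"
]
C = [
    [1.0, 1.1],  # staying in conflict is costly; maneuvering adds 0.1
    [0.0, 0.1],  # maneuvering while safe is an unnecessary cost
]

V, policy = value_iteration(P, C)
print(V, policy)
```

In this toy problem the greedy policy maneuvers only while in conflict, which mirrors the abstract's goal of avoiding unnecessary maneuvers.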

Figure 1. Schematic of UAV collision

Figure 2. Schematic of vertical maneuver

Figure 3. Effect of executed actions

Figure 4. Construction flow of the value-iteration-based dynamic collision avoidance model

Figure 5. Calculation flow of the state transition model

Figure 6. Collision avoidance for head-on aircraft at the same altitude

Figure 7. Collision avoidance for aircraft on head-on trajectories

Figure 8. Collision avoidance for aircraft on non-intersecting trajectories

Figure 9. Collision avoidance against a climbing intruder in the presence of static obstacles

Figure 10. Collision avoidance for head-on aircraft at the same altitude

Figure 11. Collision avoidance for aircraft on head-on trajectories

Figure 12. Collision avoidance against a climbing intruder in the presence of static obstacles using the classical value iteration method

Figure 13. DWA collision avoidance in the four scenarios

Figure 14. Performance comparison of the two collision avoidance methods

Table 1. Action space

Action	Max vertical speed/(m·s^-1)	Min vertical speed/(m·s^-1)	Vertical acceleration	Previous state
COC	∞	-∞	0	ALL
DES	∞	-8	0.25g	COC
CL	8	-∞	0.25g	COC

Table 2. Discrete state variables

State variable	Discrete range	Step	Number of values
h_r	-300, -270, ..., 300 m	30 m	21
v₁	-8, -7, ..., 8 m·s^-1	1 m·s^-1	17
v₂	-8, -7, ..., 8 m·s^-1	1 m·s^-1	17
a_p	COC, DES, CL	-	3
d	1, 2, 3 s	1 s	3
t	1, 2, ..., 45 s	1 s	45
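The grids in Table 2 multiply out to the state space size of 2 457 945 reported in Table 8. The sketch below rebuilds the grids and checks that product; the dictionary layout and the `state_index` helper (a row-major flattening) are illustrative assumptions, not the authors' implementation.

```python
from math import prod

ACTIONS = ("COC", "DES", "CL")

# Discrete grids copied from Table 2 (variable names follow the table).
GRID = {
    "h_r": range(-300, 301, 30),  # relative altitude, m
    "v1": range(-8, 9),           # vertical speed, m/s
    "v2": range(-8, 9),           # vertical speed, m/s
    "a_p": ACTIONS,               # previous action
    "d": range(1, 4),             # s
    "t": range(1, 46),            # s
}

sizes = {name: len(values) for name, values in GRID.items()}
n_states = prod(sizes.values())


def state_index(h_r, v1, v2, a_p, d, t):
    """Map one discrete state to a flat index (row-major, in GRID order)."""
    idx = 0
    for name, value in zip(GRID, (h_r, v1, v2, a_p, d, t)):
        values = GRID[name]
        idx = idx * len(values) + values.index(value)
    return idx


print(sizes)
print(n_states)
```

A flat index of this kind lets a value table be stored as a single array of `n_states` entries per action.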

Table 3. Basic action cost

Action	Cost C₁(a)/10^-2
COC	0.1
DES	5.6
CL	5.6

Table 4. Action switching cost

Switching case	Cost C₂(a, a_p)
a = a_p	0.00
a ≠ a_p	0.01
a ≠ a_p, a ≠ COC, and a_p ≠ COC	0.06
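Tables 3 and 4 can be read as two small lookup functions. The sketch below is one such reading; it assumes that the more specific third row of Table 4 (a direct switch between DES and CL) takes precedence over the general a ≠ a_p row, which the table does not state explicitly. The function names are hypothetical.

```python
# Table 3 lists costs in units of 10^-2, so 5.6 becomes 0.056.
BASIC_COST = {"COC": 0.1e-2, "DES": 5.6e-2, "CL": 5.6e-2}


def basic_cost(a):
    """C1(a): cost of issuing action a."""
    return BASIC_COST[a]


def switching_cost(a, a_p):
    """C2(a, a_p): cost of changing from previous action a_p to a."""
    if a == a_p:
        return 0.0
    if a != "COC" and a_p != "COC":
        # Assumed precedence: reversing between DES and CL is the costliest case.
        return 0.06
    return 0.01


print(basic_cost("DES"), switching_cost("CL", "DES"))
```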

Table 5. Altitude-based cost

Relative altitude range/m	Cost C₃(h_r)
137 ≤ h_r	-0.30
61 ≤ h_r < 137	-0.01
30 ≤ h_r < 61	0.00
15 ≤ h_r < 30	0.05
h_r < 15	0.15
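The piecewise cost in Table 5 maps naturally onto a sorted-threshold lookup. The sketch below uses `bisect_right` for that; taking the absolute value of h_r is an assumption (Table 2 allows negative relative altitudes, while Table 5 lists only non-negative bins), and the names are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of the bins in Table 5, ascending, in metres.
ALT_BOUNDS = (15, 30, 61, 137)
# Cost for each bin: below 15, [15, 30), [30, 61), [61, 137), 137 and above.
ALT_COSTS = (0.15, 0.05, 0.00, -0.01, -0.30)


def altitude_cost(h_r):
    """C3(h_r): reward large vertical separation, penalize small separation."""
    separation = abs(h_r)  # assumption: only the magnitude matters
    return ALT_COSTS[bisect_right(ALT_BOUNDS, separation)]


for h in (0, 15, 45, 100, 137, -200):
    print(h, altitude_cost(h))
```

`bisect_right` places a value equal to a bound in the bin above it, which matches the closed lower bounds (≤) in the table.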

Table 6. Cost based on time and distance to the conflict point

Decision timing	Cost C₄(h_r, a, t_c)
a ≠ COC, t_c > 30	0.25
a ≠ COC, 20 < t_c ≤ 30	0.10
a = COC, h_r < 61, t_c < 10	0.15
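Table 6 penalizes maneuvering too early and staying passive too late. A minimal sketch of that rule follows; it assumes a cost of zero for every case the table does not list, t_c in seconds, and h_r taken as a magnitude. None of these assumptions is stated in the table, and the function name is hypothetical.

```python
def timing_cost(h_r, a, t_c):
    """C4(h_r, a, t_c): penalize premature maneuvers and late inaction."""
    if a != "COC":
        if t_c > 30:
            return 0.25  # maneuvering far ahead of the conflict
        if 20 < t_c <= 30:
            return 0.10  # maneuvering somewhat early
        return 0.0       # assumed: no timing penalty otherwise
    if abs(h_r) < 61 and t_c < 10:
        return 0.15      # staying passive when close in time and altitude
    return 0.0           # assumed: no timing penalty otherwise


print(timing_cost(0, "CL", 40), timing_cost(20, "COC", 5))
```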

Table 7. Risk classification

Risk level	Relative altitude range/m
Safe	137 ≤ h_r
Relatively safe	61 ≤ h_r < 137
Alert	30 ≤ h_r < 61
Dangerous	15 ≤ h_r < 30
Extremely dangerous	h_r < 15
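The risk levels in Table 7 use the same altitude thresholds as Table 5, so a classifier can walk the bounds from the safest level down. The sketch below assumes h_r is compared by magnitude; the `Risk` enum and `risk_level` function are hypothetical names.

```python
from enum import Enum


class Risk(Enum):
    SAFE = "safe"
    RELATIVELY_SAFE = "relatively safe"
    ALERT = "alert"
    DANGEROUS = "dangerous"
    EXTREMELY_DANGEROUS = "extremely dangerous"


# (lower bound in metres, level), ordered from safest to least safe.
LEVELS = (
    (137, Risk.SAFE),
    (61, Risk.RELATIVELY_SAFE),
    (30, Risk.ALERT),
    (15, Risk.DANGEROUS),
)


def risk_level(h_r):
    """Return the Table 7 level for a relative altitude in metres."""
    separation = abs(h_r)  # assumption: only the magnitude matters
    for lower, level in LEVELS:
        if separation >= lower:
            return level
    return Risk.EXTREMELY_DANGEROUS


print(risk_level(152.5).value, risk_level(10).value)
```

Under this reading, the four final relative altitudes reported for the value iteration method (152.5, 188.0, 143.7, and 230.7 m) all fall in the "safe" level.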

Table 8. Experimental parameter settings for the classical value iteration method

Parameter	Value
State space size	2 457 945 states
Vertical speed/(m·s^-1)	min -8, max 8
Vertical acceleration	COC: 0.00; DES: 0.25g; CL: 0.25g
Cost function/10^-2	COC: 0.1; DES: 5.6; CL: 5.6

Table 9. Results of the value iteration method in four scenarios

Scenario	Scenario 1	Scenario 2	Scenario 3	Scenario 4
Final relative altitude/m	152.5	188.0	143.7	230.7
Number of action switches	5	3	2	2

Table 10. Results of the DWA method in four scenarios

Scenario	Scenario 1	Scenario 2	Scenario 3	Scenario 4
Final relative altitude/m	214.4	247.0	182.8	350.5
Number of action switches	31	18	31	20
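The 88% reduction in action switches quoted in the abstract can be checked from the switch counts in Tables 9 and 10. The short calculation below does that, assuming the figure is the relative change in the mean count over the four scenarios. The 62.4% altitude figure is not recomputed because this excerpt does not define the baseline for "excessive" adjustment.

```python
from statistics import mean

# Number of action switches per scenario, from Tables 9 and 10.
vi_switches = [5, 3, 2, 2]       # value iteration
dwa_switches = [31, 18, 31, 20]  # dynamic window approach

vi_mean = mean(vi_switches)
dwa_mean = mean(dwa_switches)
reduction = 1 - vi_mean / dwa_mean

print(f"mean switches: VI {vi_mean}, DWA {dwa_mean}")
print(f"reduction: {reduction:.0%}")
```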
