Lightweight YOLOv8-ALTE algorithm for bridge crack disease detection
Abstract: To address the low efficiency, low detection accuracy, and high missed-detection rate of bridge crack detection methods under complex conditions, a lightweight algorithm based on an improved YOLOv8, named YOLOv8-ALTE, was proposed. Taking the YOLOv8-N model as the baseline, its C2f module was fused with ALConv, a lightweight convolutional module that perceives multi-scale feature information, to enrich the crack information in the extracted feature maps. Triplet attention was embedded in the shallow layers of the feature extraction network to improve the accuracy with which the model locates and identifies bridge crack defects. A lightweight decoupled head designed with parameter sharing replaced the original decoupled head, reducing the computational complexity of the model. A multi-parameter distance intersection over union loss replaced the original regression loss, giving the model more efficient and more accurate bounding box regression. A bridge crack detection image dataset covering a variety of complex backgrounds was built by manual annotation, then organized and expanded with several data augmentation techniques. Precision, recall, average precision (AP50 and AP50-95), and floating point operations (FLOPs) served as quantitative evaluation metrics, and the model was evaluated through comparison, module fusion, attention integration, and ablation experiments. The experimental results show that YOLOv8-ALTE achieves a precision of 93.9%, a recall of 83.5%, an AP50 of 89.0%, an AP50-95 of 73.8%, and FLOPs of 8.0. Its overall performance is better than that of the original YOLOv8-N and the other compared models, which supports the proposed algorithm: YOLOv8-ALTE identifies bridge cracks efficiently and accurately while improving computational efficiency.
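
The abstract names the regression loss but does not give its formula. As a minimal sketch, assuming the loss is the MPDIoU of reference [35], the value for one pair of boxes can be computed as the IoU penalized by the squared distances between the two top-left corners and the two bottom-right corners, each normalized by the squared image diagonal. The function name `mpdiou_loss` and the `(x1, y1, x2, y2)` box layout are illustrative assumptions, not taken from the paper.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """Sketch of an MPDIoU-style loss for two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed layout).
    img_w and img_h are the width and height of the input image.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    union = area_p + area_t - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


if __name__ == "__main__":
    # Identical boxes give zero loss; disjoint boxes give a loss above 1.
    print(mpdiou_loss((10, 10, 50, 50), (10, 10, 50, 50), 480, 480))
    print(mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 480, 480))
```

Because the corner-distance terms stay non-zero when the boxes do not overlap, the loss still changes with the box position in that case, whereas a plain IoU loss would be constant.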
Key words:
- bridge engineering
- crack detection
- deep learning
- lightweight
- YOLOv8
Table 1. Data augmentation modes

| Augmentation mode | Original images | Brightness change | Mirror flip | Added noise | Total |
|---|---|---|---|---|---|
| Number of images | 1200 | 720 | 720 | 720 | 3360 |

Table 2. Partition of the dataset

| Total images | Training set | Validation set | Test set |
|---|---|---|---|
| 3360 | 2688 | 336 | 336 |

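
The counts in Tables 1 and 2 are consistent with each other, which a few lines of arithmetic can confirm. This is a sketch only: the 8:1:1 ratio is inferred from the reported counts (2688/336/336 out of 3360) rather than stated in the text, and the names `split_counts` and `AUGMENTED` are illustrative.

```python
# Image counts transcribed from Table 1.
ORIGINAL = 1200
AUGMENTED = {"brightness_change": 720, "mirror_flip": 720, "added_noise": 720}


def split_counts(total, ratios=(8, 1, 1)):
    """Split `total` items by integer ratios; any remainder goes to the first part."""
    unit = total // sum(ratios)
    counts = [unit * r for r in ratios]
    counts[0] += total - sum(counts)
    return tuple(counts)


total_images = ORIGINAL + sum(AUGMENTED.values())
train, val, test = split_counts(total_images)

if __name__ == "__main__":
    print(total_images, train, val, test)
```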
Table 3. Classification confusion matrix

| Actual \ Predicted | Positive | Negative |
|---|---|---|
| Positive | TP | FN |
| Negative | FP | TN |

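
Precision and recall follow directly from the entries of the confusion matrix in Table 3: precision is TP / (TP + FP) and recall is TP / (TP + FN). A minimal sketch follows; the counts in the usage lines are made-up illustrative numbers, not results from the paper.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are actually positive."""
    denom = tp + fp
    return tp / denom if denom else 0.0


def recall(tp, fn):
    """Fraction of actual positives that the model detects."""
    denom = tp + fn
    return tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    tp, fp, fn = 90, 10, 30
    print(precision(tp, fp), recall(tp, fn))
```

A missed crack counts as an FN, so the missed-detection rate mentioned in the abstract falls as recall rises.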
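Table 4. Training hyperparameter settings

| Name | Value |
|---|---|
| Image size/pixels | 480×480 |
| Initial learning rate | 0.01 |
| Warm-up epochs | 3 |
| Batch size | 16 |
| Optimizer | Stochastic gradient descent |
| Weight decay coefficient | 0.0005 |
| Momentum | 0.937 |
| Training epochs | 200 |
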
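
For readers who want to reproduce a comparable training setup, the values in Table 4 can be collected into a single configuration mapping. The sketch below assumes the argument names used by the Ultralytics YOLOv8 trainer (`imgsz`, `lr0`, `warmup_epochs`, and so on); the source does not say which training framework was used, and the warm-up entry is read here as a number of epochs.

```python
# Values transcribed from Table 4; the key names are an assumption
# (Ultralytics-style training arguments), not taken from the paper.
TRAIN_ARGS = {
    "imgsz": 480,            # input image size in pixels (480x480)
    "lr0": 0.01,             # initial learning rate
    "warmup_epochs": 3,      # warm-up length, assumed to be in epochs
    "batch": 16,             # batch size
    "optimizer": "SGD",      # stochastic gradient descent
    "weight_decay": 0.0005,  # weight decay coefficient
    "momentum": 0.937,       # SGD momentum
    "epochs": 200,           # number of training epochs
}

if __name__ == "__main__":
    for name, value in TRAIN_ARGS.items():
        print(f"{name} = {value}")
    # With Ultralytics installed, these could be passed as keyword arguments,
    # e.g. model.train(data="cracks.yaml", **TRAIN_ARGS), where "cracks.yaml"
    # is a hypothetical dataset description file.
```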
Table 5. Experimental results of the compared models

| Model | Precision/% | Recall/% | AP50/% | AP50-95/% | FLOPs |
|---|---|---|---|---|---|
| YOLOv5-N | 94.0 | 79.4 | 87.2 | 70.1 | 7.2 |
| YOLOv6-N | 93.2 | 80.8 | 87.1 | 71.1 | 11.9 |
| YOLOv7-Tiny | 89.8 | 86.3 | 92.1 | 69.6 | 13.2 |
| RT-DETR-R18 | 91.7 | 78.2 | 80.9 | 57.0 | 58.3 |
| YOLOv8-N | 91.4 | 80.8 | 88.0 | 71.4 | 8.2 |
| YOLOv8-ALTE | 93.9 | 83.5 | 89.0 | 73.8 | 8.0 |

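
The improvement over the baseline can be read from Table 5 by subtracting the YOLOv8-N row from the row of any other model. The sketch below transcribes the table and computes those differences; the dictionary layout and the function name `delta_vs_baseline` are illustrative choices.

```python
# Rows transcribed from Table 5: precision, recall, AP50, AP50-95 (all in %), FLOPs.
METRICS = ("precision", "recall", "ap50", "ap50_95", "flops")
RESULTS = {
    "YOLOv5-N":    (94.0, 79.4, 87.2, 70.1, 7.2),
    "YOLOv6-N":    (93.2, 80.8, 87.1, 71.1, 11.9),
    "YOLOv7-Tiny": (89.8, 86.3, 92.1, 69.6, 13.2),
    "RT-DETR-R18": (91.7, 78.2, 80.9, 57.0, 58.3),
    "YOLOv8-N":    (91.4, 80.8, 88.0, 71.4, 8.2),
    "YOLOv8-ALTE": (93.9, 83.5, 89.0, 73.8, 8.0),
}


def delta_vs_baseline(model, baseline="YOLOv8-N"):
    """Per-metric difference between `model` and `baseline`, rounded to 0.1."""
    return {
        name: round(m - b, 1)
        for name, m, b in zip(METRICS, RESULTS[model], RESULTS[baseline])
    }


if __name__ == "__main__":
    print(delta_vs_baseline("YOLOv8-ALTE"))
```

For YOLOv8-ALTE this gives gains of 2.5, 2.7, 1.0, and 2.4 percentage points in precision, recall, AP50, and AP50-95, with FLOPs lower by 0.2. The same table shows that YOLOv5-N has slightly higher precision and lower FLOPs, and that YOLOv7-Tiny has higher recall and AP50 at higher FLOPs, so the advantage claimed for YOLOv8-ALTE concerns overall performance rather than every individual metric.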
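Table 6. Experimental results of fusing different convolution strategies

| Model | Fusion type | Precision/% | Recall/% | AP50/% | AP50-95/% | FLOPs |
|---|---|---|---|---|---|---|
| Model 1 | ContextGuided[42] | 91.4 | 79.7 | 86.8 | 70.1 | 5.9 |
| Model 2 | ODConv[43] | 91.2 | 78.6 | 86.1 | 65.4 | 5.8 |
| Model 3 | REPVGGOREPA[44] | 87.5 | 71.6 | 82.4 | 61.0 | 6.8 |
| Model 4 | RFAConv[45] | 90.8 | 83.9 | 88.1 | 72.2 | 9.6 |
| Model 5 | SCConv[46] | 88.1 | 70.9 | 81.3 | 54.3 | 7.9 |
| Model 6 | ALConv | 90.7 | 82.6 | 88.6 | 73.0 | 8.0 |
| YOLOv8-N | | 91.4 | 80.8 | 88.0 | 71.4 | 8.2 |
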
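Table 7. Experimental results of attention mechanism placement

| Model | Position | Precision/% | Recall/% | AP50/% | AP50-95/% | FLOPs |
|---|---|---|---|---|---|---|
| YOLOv8-N | | 91.4 | 80.8 | 88.0 | 71.4 | 8.2 |
| Model 6 | | 90.7 | 82.6 | 88.6 | 73.0 | 8.0 |
| Model 7 | After pooling | 91.0 | 83.5 | 88.0 | 71.0 | 8.1 |
| Model 8 | Before pooling | 91.6 | 82.6 | 88.6 | 73.3 | 8.1 |
| Model 9 | Before C2f-2 | 92.6 | 83.2 | 87.9 | 72.9 | 8.1 |
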
Table 8. Experimental results of module ablation

| Model | Added/replaced modules | Precision/% | Recall/% | AP50/% | AP50-95/% | FLOPs |
|---|---|---|---|---|---|---|
| YOLOv8-N | | 91.4 | 80.8 | 88.0 | 71.4 | 8.2 |
| Model 6 | ALConv | 90.7 | 82.6 | 88.6 | 73.0 | 8.0 |
| Model 10 | ALConv+TA | 92.6 | 83.2 | 87.9 | 72.9 | 8.1 |
| Model 11 | ALConv+TA+AFPN[47] | 89.9 | 82.3 | 87.6 | 73.6 | 10.4 |
| Model 12 | ALConv+TA+AU | 92.3 | 81.3 | 87.6 | 72.2 | 11.1 |
| Model 13 | ALConv+TA+ED | 92.6 | 82.0 | 88.1 | 73.9 | 8.0 |
| YOLOv8-ALTE | ALConv+TA+ED+M | 93.9 | 83.5 | 89.0 | 73.8 | 8.0 |

[1] Editorial Department of China Journal of Highway and Transport. Review on China's bridge engineering research: 2021[J]. China Journal of Highway and Transport, 2021, 34(2): 1-97. (in Chinese)
[2] CHEN Lei-lei, ZHAO Xin-yuan, QIAN Zhen-dong, et al. A systematic review of steel bridge deck pavement in China[J]. Journal of Road Engineering, 2023, 3(1): 1-15. doi: 10.1016/j.jreng.2023.01.003
[3] ZHANG A A, SHANG J, LI B, et al. Intelligent pavement condition survey: Overview of current researches and practices[J]. Journal of Road Engineering, 2024, 4(3): 257-281. doi: 10.1016/j.jreng.2024.04.003
[4] CHU Hong-hu, YUAN Hua-qing, LONG Li-zhi, et al. High-resolution bridge crack image cascade segmentation method based on Transformer[J]. China Journal of Highway and Transport, 2024, 37(2): 65-76. (in Chinese)
[5] LIU T, ZHANG L J, ZHOU G X, et al. BC-DUnet-based segmentation of fine cracks in bridges under a complex background[J]. PLoS ONE, 2022, 17(3): e0265258. doi: 10.1371/journal.pone.0265258
[6] CHENG Y H, TIAN L L, YIN C, et al. A magnetic domain spots filtering method with self-adapting threshold value selecting for crack detection based on the MOI[J]. Nonlinear Dynamics, 2016, 86(2): 741-750. doi: 10.1007/s11071-016-2918-7
[7] MA Ya-fei, SUN Wen-kang, HE Yu, et al. Surface crack identification method of concrete bridge based on DC-Unet[J]. Journal of Chang'an University (Natural Science Edition), 2024, 44(3): 66-75. (in Chinese)
[8] HOANG N D, NGUYEN Q L, TRAN V D. Automatic recognition of asphalt pavement cracks using metaheuristic optimized edge detection algorithms and convolution neural network[J]. Automation in Construction, 2018, 94: 203-213. doi: 10.1016/j.autcon.2018.07.008
[9] SHENG P, CHEN L, TIAN J. Learning-based road crack detection using gradient boost decision tree[C]//IEEE. 2018 13th IEEE Conference on Industrial Electronics and Applications (ICIEA). New York: IEEE, 2018: 1228-1232.
[10] NOH Y, KOO D, KANG Y M, et al. Automatic crack detection on concrete images using segmentation via fuzzy C-means clustering[C]//IEEE. 2017 International Conference on Applied System Innovation (ICASI). New York: IEEE, 2017: 877-880.
[11] QU Z, CHEN Y X, LIU L, et al. The algorithm of concrete surface crack detection based on the genetic programming and percolation model[J]. IEEE Access, 2019, 7: 57592-57603. doi: 10.1109/ACCESS.2019.2914259
[12] ZHANG Z Y, LIU Y P, LIU T C, et al. DAGN: A real-time UAV remote sensing image vehicle detection framework[J]. IEEE Geoscience and Remote Sensing Letters, 2019, 17(11): 1884-1888.
[13] CHEN W Y, WANG H F, LI H, et al. Real-time garbage object detection with data augmentation and feature fusion using SUAV low-altitude remote sensing images[J]. IEEE Geoscience and Remote Sensing Letters, 2021, 19: 1-5.
[14] JANG K, AN Y K, KIM B, et al. Automated crack evaluation of a high-rise bridge pier using a ring-type climbing robot[J]. Computer-aided Civil and Infrastructure Engineering, 2021, 36(1): 14-29. doi: 10.1111/mice.12550
[15] SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]//IEEE. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2015: 1-9.
[16] JIANG P Y, ERGU D, LIU F Y, et al. A review of YOLO algorithm developments[J]. Procedia Computer Science, 2022, 199: 1066-1073. doi: 10.1016/j.procs.2022.01.135
[17] MA Ya-fei, SUN Wen-kang, HE Yu, et al. Surface crack identification method of concrete bridge based on DC-Unet[J]. Journal of Chang'an University (Natural Science Edition), 2024, 44(3): 66-75. (in Chinese)
[18] XIE X X, CHENG G, WANG J B, et al. Oriented R-CNN for object detection[C]//IEEE. Proceedings of the IEEE/CVF International Conference on Computer Vision. New York: IEEE, 2021: 3520-3529.
[19] GIRSHICK R. Fast R-CNN[C]//IEEE. 2015 IEEE International Conference on Computer Vision. New York: IEEE, 2015: 1440-1448.
[20] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
[21] ZHU J Q, ZHONG J T, MA T, et al. Pavement distress detection using convolutional neural networks with images captured via UAV[J]. Automation in Construction, 2022, 133: 103991. doi: 10.1016/j.autcon.2021.103991
[22] JIANG Shi-xin, ZOU Xiao-xue, YANG Jian-xi, et al. Concrete bridge crack detection method based on improved YOLO v8s in complex backgrounds[J]. Journal of Traffic and Transportation Engineering, 2024, 24(6): 135-147. (in Chinese) doi: 10.19818/j.cnki.1671-1637.2024.06.009
[23] PENG Yu-nuo, LIU Min, WAN Zhi, et al. A dual deep network based on the improved YOLO for fast bridge surface defect detection[J]. Acta Automatica Sinica, 2022, 48(4): 1018-1032. (in Chinese)
[24] YIN Guan-sheng, GAO Jian-guo, SHI Ming-hui, et al. Tunnel crack recognition method under image block[J]. Journal of Traffic and Transportation Engineering, 2022, 22(2): 148-159. (in Chinese) doi: 10.19818/j.cnki.1671-1637.2022.02.011
[25] ZHAI Jun-zhi, SUN Zhao-yun, PEI Li-li, et al. Pavement crack detection method based on multi-scale feature enhancement[J]. Journal of Traffic and Transportation Engineering, 2023, 23(1): 291-308. (in Chinese) doi: 10.19818/j.cnki.1671-1637.2023.01.022
[26] MEI Q P, GÜL M, AZIM R. Densely connected deep neural network considering connectivity of pixels for automatic crack detection[J]. Automation in Construction, 2020, 110: 103018. doi: 10.1016/j.autcon.2019.103018
[27] REIS D, HONG J, KUPEC J, et al. Real-time flying object detection with YOLOv8[J]. ArXiv, 2023, DOI: 10.48550/arXiv.2305.09972.
[28] DAI L Y, LIU G, HUANG L, et al. Feature transfer method for infrared and visible image fusion via fuzzy lifting scheme[J]. Infrared Physics and Technology, 2021, 114: 103621. doi: 10.1016/j.infrared.2020.103621
[29] HAN K, WANG Y H, TIAN Q, et al. GhostNet: More features from cheap operations[C]//IEEE. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2020: 1577-1586.
[30] HOWARD A G, ZHU M L, CHEN B, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications[J]. ArXiv, 2017, DOI: 10.48550/arXiv.1704.04861.
[31] SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: Inverted residuals and linear bottlenecks[C]//IEEE. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2018: 4510-4520.
[32] HOWARD A, SANDLER M, CHU G, et al. Searching for MobileNetV3[C]//IEEE. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). New York: IEEE, 2020: 1314-1324.
[33] MISRA D, NALAMADA T, ARASANIPALAI A U, et al. Rotate to attend: Convolutional triplet attention module[C]//IEEE. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). New York: IEEE, 2020: 3138-3147.
[34] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318-327. doi: 10.1109/TPAMI.2018.2858826
[35] MA S L, XU Y. MPDIoU: A loss for efficient and accurate bounding box regression[J]. ArXiv, 2023, DOI: 10.48550/arXiv.2307.07662.
[36] LIU Y H, YAO J, LU X H, et al. DeepCrack: A deep hierarchical feature learning architecture for crack segmentation[J]. Neurocomputing, 2019, 338: 139-153. doi: 10.1016/j.neucom.2019.01.036
[37] ZHANG L, YANG F, ZHANG D Y, et al. Road crack detection using deep convolutional neural network[C]//IEEE. Proceedings of the IEEE International Conference on Image Processing. New York: IEEE, 2016: 3708-3712.
[38] YANG F, ZHANG L, YU S J, et al. Feature pyramid and hierarchical boosting network for pavement crack detection[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 21(4): 1525-1535. doi: 10.1109/TITS.2019.2910595
[39] LI C Y, LI L L, JIANG H L, et al. YOLOv6: A single-stage object detection framework for industrial applications[J]. ArXiv, 2022, https://doi.org/10.48550/arXiv.2209.02976.
[40] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//IEEE. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2023: 7464-7475.
[41] ZHAO Y A, LYU W Y, XU S L, et al. DETRs beat YOLOs on real-time object detection[J]. ArXiv, 2023, https://doi.org/10.48550/arXiv.2304.08069.
[42] WU T Y, TANG S, ZHANG R, et al. CGNet: A light-weight context guided network for semantic segmentation[J]. IEEE Transactions on Image Processing, 2021, 30: 1169-1179. doi: 10.1109/TIP.2020.3042065
[43] LI C, ZHOU A J, YAO A B. Omni-dimensional dynamic convolution[J]. ArXiv, 2022, https://doi.org/10.48550/arXiv.2209.07947.
[44] HU M, FENG J Y, HUA J S, et al. Online convolutional re-parameterization[J]. ArXiv, 2022, DOI: 10.48550/arXiv.2204.00826.
[45] ZHANG X, LIU C, SONG T T, et al. RFAConv: Innovating spatial attention and standard convolutional operation[J]. ArXiv, 2023, https://doi.org/10.48550/arXiv.2304.03198.
[46] LI J F, WEN Y, HE L H. SCConv: Spatial and channel reconstruction convolution for feature redundancy[C]//IEEE. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2023: 6153-6162.
[47] YANG G Y, LEI J, ZHU Z K, et al. AFPN: Asymptotic feature pyramid network for object detection[J]. ArXiv, 2023, https://doi.org/10.48550/arXiv.2306.15988.