Improved SSD model in extraction application of expressway toll station locations from GaoFen 2 remote sensing image
Abstract: Expressway toll stations in GaoFen 2 remote sensing images were taken as the research object. Toll station locations and 0.8 m resolution remote sensing images from 2019 for five provinces and municipalities (Beijing, Shanxi, Henan, Guangdong, and Fujian) were selected, and a training sample dataset was built through image preprocessing, sample labeling, cropping, data augmentation, and dataset partitioning. Multiscale feature fusion was introduced to improve the single-shot multibox detector (SSD) object detection model: by adding transposed convolution ("deconvolution") and concatenation ("concat") operations, the semantic features of high-level feature maps were passed to low-level feature maps, which enhanced upsampling quality and feature fusion capability and thereby improved the detection of small toll station targets. The improved SSD model was then applied to extract toll station locations from 2019 GaoFen 2 images of Fujian Province. The images were automatically sliced along the vector data of the Fujian expressway network, and the slices were fed into the model for object detection. Slices containing toll stations were retained, non-maximum suppression was used to remove redundant detection boxes, and the coordinates of the remaining boxes were converted to center point coordinates, so that the center point vectors of the toll stations were output directly. This realized end-to-end automatic extraction of toll station locations. The results show that the precision, recall, and their harmonic mean for the improved SSD model are 0.86, 0.88, and 0.87, respectively, all higher than those of the conventional SSD, VGG, Faster R-CNN, and feature pyramid network (FPN) models. The automatic extraction of toll station locations can therefore considerably improve the working efficiency of highway managers and meet their practical needs. 3 tabs, 7 figs, 35 refs.
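The post-processing steps named in the abstract (non-maximum suppression, then converting each remaining detection box to its center point) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the IoU threshold of 0.5, the `(xmin, ymin, xmax, ymax)` box layout, all function names, and the north-up georeferencing in `pixel_to_map` are assumptions introduced here.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def box_center(box: Box) -> Tuple[float, float]:
    """Center point of a box, in the same (pixel) coordinates as the box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def pixel_to_map(col: float, row: float, origin_x: float, origin_y: float,
                 pixel_size: float) -> Tuple[float, float]:
    """Pixel to map coordinates for a north-up image (assumed georeferencing)."""
    return (origin_x + col * pixel_size, origin_y - row * pixel_size)


# Two overlapping detections of one toll station plus one separate detection.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (200, 200, 240, 240)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
centers = [box_center(boxes[i]) for i in kept]
print(kept, centers)
```

In this example the second box overlaps the first with an IoU of about 0.82, so it is suppressed and two center points remain. With a 0.8 m pixel size, `pixel_to_map` would turn each center into a point that can be written to a vector layer.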
Table 1. Dimensions of feature maps at different levels of the SSD model

| | Conv 4 | Conv 7 | Conv 8 | Conv 9 | Conv 10 | Conv 11 |
|---|---|---|---|---|---|---|
| Size (pixels) | 38×38 | 19×19 | 10×10 | 5×5 | 3×3 | 1×1 |
| Channels | 512 | 1024 | 512 | 256 | 256 | 256 |

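
The shape arithmetic behind fusing a high-level feature map into a lower-level one with transposed convolution and concatenation can be sketched with the sizes in Table 1. The output-size formula is the standard one for a transposed convolution without dilation or output padding. The kernel size, stride, and number of output channels below are hypothetical choices for illustration; the source does not state which settings or which layer pairs the improved model uses.

```python
def transposed_conv_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output side length of a transposed convolution
    (no dilation, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


# Feature maps from Table 1 as (side length in pixels, channels).
conv4 = (38, 512)
conv7 = (19, 1024)

# Hypothetical upsampling settings: 2x2 kernel, stride 2, 512 output channels.
up_side = transposed_conv_size(conv7[0], kernel=2, stride=2)
up_channels = 512  # assumption, not taken from the source

# Concatenation along the channel axis requires equal spatial sizes.
if up_side != conv4[0]:
    raise ValueError("spatial sizes differ; the maps cannot be concatenated")

fused = (conv4[0], conv4[1] + up_channels)
print(fused)  # (38, 1024)
```

Under these assumed settings the 19×19 map is upsampled to 38×38, and concatenation simply adds the channel counts, so the fused map keeps the resolution of the low-level layer while carrying the semantic features of the higher one.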
Table 2. Average precision on the validation dataset during model training

| Model | Average precision |
|---|---|
| VGG | 0.69 |
| Faster R-CNN | 0.81 |
| SSD | 0.83 |
| FPN | 0.92 |
| Improved SSD | 0.96 |

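Table 3. Evaluation of toll station extraction results from GaoFen 2 remote sensing images of Fujian Province

| Model | P (precision) | R (recall) | F (harmonic mean of P and R) |
|---|---|---|---|
| VGG | 0.51 | 0.62 | 0.56 |
| Faster R-CNN | 0.74 | 0.76 | 0.75 |
| SSD | 0.76 | 0.79 | 0.77 |
| FPN | 0.83 | 0.84 | 0.84 |
| Improved SSD | 0.86 | 0.88 | 0.87 |
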
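
Since F is described as the harmonic mean of precision and recall, the F column of Table 3 can be recomputed from the P and R columns. The sketch below does this; the function name `f_score` and the dictionary layout are illustrative choices, and the inputs are the rounded values printed in the table, not the underlying detection counts.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (returns 0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R) pairs copied from Table 3.
results = {
    "VGG": (0.51, 0.62),
    "Faster R-CNN": (0.74, 0.76),
    "SSD": (0.76, 0.79),
    "FPN": (0.83, 0.84),
    "Improved SSD": (0.86, 0.88),
}

for model, (p, r) in results.items():
    print(f"{model}: F = {f_score(p, r):.3f}")
```

For the improved SSD this gives about 0.870, matching the reported 0.87. The FPN row recomputes to about 0.835 rather than the tabulated 0.84, which is presumably because the published P and R are themselves rounded to two decimals.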