Bus passenger flow classification prediction driven by CNN-GRU model and multi-source data
-
Abstract: To accurately analyze the travel characteristics and time-varying differences of different passenger flows on bus routes and at stops, a classified bus passenger flow prediction model combining a convolutional neural network (CNN) and a gated recurrent unit (GRU) was proposed on the basis of deep learning theory. Multi-source data, including bus smart card swipes, bus global positioning system (GPS) trajectories, basic route and stop information, and weather data, were fused and matched to reconstruct the bus passenger flow data. The K-medians algorithm was used to divide passengers into commuters and non-commuters. With passenger type, historical passenger flow, time period, peak/off-peak period, day of the week, precipitation, and major events as the input vector, single CNN and GRU models were established, and their predictions were evaluated using the mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE). Because the single models are not well suited to multi-feature time series prediction, combined CNN-GRU models were constructed for route passenger flow and cross-section passenger flow. Taking the Beijing Special 15 bus route as an example, the classified passenger flows of the route and its cross-sections were predicted for working days and non-working days. The results show that, for the commuter and non-commuter route passenger flows and the cross-section passenger flow, the MSE of the combined model is lower than that of the single models by 57.932, 13.106, and 33.987 on average, the RMSE by 1.862, 1.058, and 1.538, and the MAE by 1.399, 0.487, and 0.613, respectively. The CNN-GRU combined model driven by multi-source data therefore shows good prediction performance. 3 tabs, 7 figs, 36 refs.
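The three evaluation indicators named in the abstract can be sketched in plain Python as follows. This is a minimal illustration using the standard definitions of MSE, RMSE, and MAE; the function names and the sample passenger counts are hypothetical and do not come from the paper.

```python
import math


def mse(actual, predicted):
    """Mean squared error: the average of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error: the average of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical passenger counts per time interval (illustrative only).
actual = [120, 95, 130, 80]
predicted = [110, 100, 125, 90]

print(mse(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

Because MSE squares each residual, it penalizes large misses more heavily than MAE does, which is one reason to report the indicators side by side.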
-
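Table 1. Field names of the original card swipe data

| No. | Field name | Meaning | Notes |
|---|---|---|---|
| 1 | GRAND_CARD_CODE | Smart card number | |
| 2 | LINE_CODE | Route number | |
| 3 | ON_STATION | Boarding stop | |
| 4 | OFF_STATION | Alighting stop | |
| 5 | DEAL_TIME | Transaction time | Time of the second swipe |
| 6 | DEAL_TYPE | Transaction type | 06 denotes a normal transaction |
| 7 | CARD_TYPE | Card type | 1: ordinary card; 18: senior card; 19: student card; etc. |
| 8 | RUN_COMP_CODE | Bus operating company number | |
| 9 | VEHICLE_CODE | Vehicle number | |
| 10 | DRIVER_CODE | Driver number | |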
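As a rough illustration of how the swipe-record fields in Table 1 might be used, the sketch below keeps only normal transactions (DEAL_TYPE 06) and labels the card type. The `Swipe` class, the `parse_swipes` helper, the timestamp layout, the integer stop numbers, and the sample rows are all assumptions made for this example; the source specifies only the field names and the codes shown in the table.

```python
from dataclasses import dataclass
from datetime import datetime

# Card-type codes listed in Table 1; the table notes that other codes exist.
CARD_TYPES = {"1": "ordinary", "18": "senior", "19": "student"}
NORMAL_DEAL_TYPE = "06"  # Table 1: 06 denotes a normal transaction


@dataclass
class Swipe:
    card: str
    line: str
    on_station: int
    off_station: int
    deal_time: datetime
    card_type: str


def parse_swipes(rows):
    """Drop abnormal transactions and convert the remaining rows to Swipe objects."""
    swipes = []
    for row in rows:
        if row["DEAL_TYPE"] != NORMAL_DEAL_TYPE:
            continue
        swipes.append(Swipe(
            card=row["GRAND_CARD_CODE"],
            line=row["LINE_CODE"],
            # Assumption: stops are stored as sequence numbers along the route.
            on_station=int(row["ON_STATION"]),
            off_station=int(row["OFF_STATION"]),
            # Assumption: timestamps use the YYYYMMDDHHMMSS layout.
            deal_time=datetime.strptime(row["DEAL_TIME"], "%Y%m%d%H%M%S"),
            card_type=CARD_TYPES.get(row["CARD_TYPE"], "other"),
        ))
    return swipes


# Hypothetical rows, made up for illustration.
rows = [
    {"GRAND_CARD_CODE": "A001", "LINE_CODE": "T15", "ON_STATION": "3",
     "OFF_STATION": "9", "DEAL_TIME": "20190304081530",
     "DEAL_TYPE": "06", "CARD_TYPE": "1"},
    {"GRAND_CARD_CODE": "A002", "LINE_CODE": "T15", "ON_STATION": "5",
     "OFF_STATION": "7", "DEAL_TIME": "20190304082000",
     "DEAL_TYPE": "07", "CARD_TYPE": "18"},
    {"GRAND_CARD_CODE": "A003", "LINE_CODE": "T15", "ON_STATION": "2",
     "OFF_STATION": "11", "DEAL_TIME": "20190304173000",
     "DEAL_TYPE": "06", "CARD_TYPE": "19"},
]

swipes = parse_swipes(rows)
print([(s.card, s.card_type, s.deal_time.hour) for s in swipes])
```

Filtering on DEAL_TYPE before any aggregation keeps abnormal transactions out of the passenger counts that later stages depend on.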
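Table 2. Main influencing factors of bus passenger flow

| Variable | Influencing factor | Data type | Data range |
|---|---|---|---|
| B1 | Passenger type | Numerical | Set according to the passenger flow classification scheme |
| B2 | Precipitation | | Precipitation amount |
| B3 | Historical passenger flow | | Chosen according to the prediction time granularity |
| B4 | Time period | | The 24 h of a day |
| B5 | Peak/off-peak period | | Early off-peak: 0; morning peak: 1; midday off-peak: 2; evening peak: 3; late off-peak: 4 |
| B6 | Day of the week | | 1, 2, …, 7 |
| B7 | Major events and emergencies | | Passenger gathering: 0; passenger dispersal: 1 |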
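The factors B1 to B7 in Table 2 can be assembled into a single input vector along the following lines. This is only a sketch: the peak-hour boundaries, the Monday = 1 convention for the day of the week, the passenger-type code, and the function names are assumptions, since the source gives the codes but not how they are derived.

```python
from datetime import datetime

# Assumed peak windows in hours; Table 2 gives only the codes 0 to 4.
MORNING_PEAK = (7, 9)
EVENING_PEAK = (17, 19)


def peak_code(hour):
    """Map an hour of the day to the B5 code of Table 2."""
    if hour < MORNING_PEAK[0]:
        return 0  # early off-peak
    if hour < MORNING_PEAK[1]:
        return 1  # morning peak
    if hour < EVENING_PEAK[0]:
        return 2  # midday off-peak
    if hour < EVENING_PEAK[1]:
        return 3  # evening peak
    return 4      # late off-peak


def feature_vector(passenger_type, precipitation, past_flow, when, event):
    """Assemble [B1, ..., B7] in the order used by Table 2."""
    return [
        passenger_type,        # B1: code from the classification scheme
        precipitation,         # B2: precipitation amount
        past_flow,             # B3: historical passenger flow
        when.hour,             # B4: hour of the day, 0 to 23
        peak_code(when.hour),  # B5: peak/off-peak code
        when.isoweekday(),     # B6: 1 to 7, assuming Monday = 1
        event,                 # B7: 0 gathering, 1 dispersal (passed through)
    ]


# Hypothetical sample: passenger type 1, no rain, 35 passengers in the
# previous interval, Monday 4 March 2019 at 08:00, event code 0.
vector = feature_vector(1, 0.0, 35, datetime(2019, 3, 4, 8, 0), 0)
print(vector)
```

Keeping the vector in the same order as the table makes it easier to check each model input against its definition.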
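Table 3. Prediction errors of the ARIMA, CNN, GRU, and CNN-GRU models

| Short-term bus passenger flow | Model | MSE | RMSE | MAE |
|---|---|---|---|---|
| Unclassified passenger flow | ARIMA model [36] | 573.294 | 23.944 | 16.537 |
| | CNN model | 456.194 | 21.359 | 13.375 |
| | GRU model | 422.867 | 20.564 | 12.986 |
| | CNN-GRU combined model | 324.453 | 18.013 | 10.734 |
| Class 1 passenger flow | ARIMA model | 365.236 | 19.111 | 13.963 |
| | CNN model | 282.373 | 16.804 | 10.290 |
| | GRU model | 259.577 | 16.111 | 9.975 |
| | CNN-GRU combined model | 213.043 | 14.596 | 8.734 |
| Class 2 passenger flow | ARIMA model | 61.205 | 7.823 | 4.341 |
| | CNN model | 46.448 | 6.815 | 3.169 |
| | GRU model | 43.735 | 6.613 | 3.403 |
| | CNN-GRU combined model | 31.986 | 5.656 | 2.799 |
| Maximum cross-section passenger flow | ARIMA model | 261.279 | 16.164 | 9.824 |
| | CNN model | 144.721 | 12.030 | 7.296 |
| | GRU model | 133.987 | 11.575 | 6.862 |
| | CNN-GRU combined model | 105.367 | 10.265 | 6.466 |
-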
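The average error reductions quoted in the abstract can be reproduced from Table 3 with a few lines of arithmetic. The sketch below assumes that "single models" means the CNN and GRU models, so that each reduction is the mean of their two errors minus the CNN-GRU error; the dictionary layout and the function name are illustrative choices, not part of the paper.

```python
# (MSE, RMSE, MAE) per model, copied from Table 3.
TABLE3 = {
    "Class 1": {
        "CNN": (282.373, 16.804, 10.290),
        "GRU": (259.577, 16.111, 9.975),
        "CNN-GRU": (213.043, 14.596, 8.734),
    },
    "Class 2": {
        "CNN": (46.448, 6.815, 3.169),
        "GRU": (43.735, 6.613, 3.403),
        "CNN-GRU": (31.986, 5.656, 2.799),
    },
    "Maximum cross-section": {
        "CNN": (144.721, 12.030, 7.296),
        "GRU": (133.987, 11.575, 6.862),
        "CNN-GRU": (105.367, 10.265, 6.466),
    },
}


def average_reduction(errors):
    """Mean single-model error minus combined-model error, per metric."""
    singles = [errors["CNN"], errors["GRU"]]
    combined = errors["CNN-GRU"]
    return tuple(
        sum(model[i] for model in singles) / len(singles) - combined[i]
        for i in range(3)
    )


reductions = {flow: average_reduction(errors) for flow, errors in TABLE3.items()}

for flow, (d_mse, d_rmse, d_mae) in reductions.items():
    print(f"{flow}: MSE {d_mse:.4f}, RMSE {d_rmse:.4f}, MAE {d_mae:.4f}")
```

Under this reading, the computed reductions agree with the figures in the abstract to within rounding, with Class 1, Class 2, and maximum cross-section flows matching the first, second, and third value of each triple.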
[1] SINGHAL A, KAMGA C, YAZICI A. Impact of weather on urban transit ridership[J]. Transportation Research Part A: Policy and Practice, 2014, 69: 379-391.
[2] LIN Xiao-wen, YE Xia-fei. An improved model of urban rail transit access passenger flow forecasting considering weather impact[J]. Urban Public Transport, 2014(7): 26-29. (in Chinese) doi: 10.3969/j.issn.1009-1467.2014.07.011
[3] LIU Xin-tong. Research on short-term bus passenger demand forecasting under rainy weather conditions[D]. Harbin: Harbin Institute of Technology, 2016. (in Chinese)
[4] CUI Cheng-liang, ZHAO Ya-li, DUAN Zheng-yu. Research on the stability of public transit passenger travel behavior based on smart card data[C]//MA Jian-ming, YIN Ya-feng, HUANG He-lai, et al. COTA International Conference of Transportation Professionals. Reston: ASCE, 2014: 1318-1326.
[5] ZHAO Jin. Research on public transport route selection model based on meticulous population classification[D]. Beijing: Beijing University of Technology, 2017. (in Chinese)
[6] KIEU L M, BHASKAR A, CHUNG E. Passenger segmentation using smart card data[J]. IEEE Transactions on Intelligent Transportation Systems, 2015, 16(3): 1537-1548. doi: 10.1109/TITS.2014.2368998
[7] HE Zhao-cheng, YU Chang, XU Min-xing. Analyzing methods of residents' travel characteristics considering travel patterns and periodicity[J]. Journal of Transportation Systems Engineering and Information Technology, 2016, 16(6): 135-141. (in Chinese) doi: 10.3969/j.issn.1009-6744.2016.06.021
[8] XU Wei, QIN Yong, HUANG Hou-kuan. A new method of railway passenger flow forecasting based on spatio-temporal data mining[C]//IEEE. Proceedings of the 7th International IEEE Conference on Intelligent Transportation Systems. New York: IEEE, 2004: 402-405.
[9] GU Yang, HAN Yin, FANG Xue-li. Method of hub station passenger flow forecasting based on ARMA model[J]. Journal of Transport Information and Safety, 2011, 29(2): 5-9. (in Chinese) doi: 10.3963/j.ISSN1674-4861.2011.02.002
[10] ZHAO Yang-yang, XIA Liang, JIANG Xin-guo. Short-term metro passenger flow prediction based on EMD-LSTM[J]. Journal of Traffic and Transportation Engineering, 2020, 20(4): 194-204. (in Chinese) doi: 10.19818/j.cnki.1671-1637.2020.04.016
[11] WANG Hai-zhong, LIU Lu, DONG Shang-jia, et al. A novel work zone short-term vehicle-type specific traffic speed prediction model through the hybrid EMD-ARIMA framework[J]. Transportmetrica B: Transport Dynamics, 2016, 4(3): 159-186. doi: 10.1080/21680566.2015.1060582
[12] LEI Ding-you, MA Qiang, XU Xin-ping, et al. Forecasting method of expressway traffic volume based on NPCA and GA-RBF[J]. Journal of Traffic and Transportation Engineering, 2018, 18(3): 210-217. (in Chinese) doi: 10.3969/j.issn.1671-1637.2018.03.022
[13] TONG Gang, FAN Chun-ling, CUI Feng-ying, et al. Fuzzy neural network model applied in the traffic flow prediction[C]//IEEE. The 2006 IEEE International Conference on Information Acquisition. New York: IEEE, 2006: 1229-1233.
[14] SUKENS J K, BRABANLER J D, LUKAS L, et al. Weighted least squares support vector machines: robustness and sparse approximation[J]. Neurocomputing, 2002, 48(1): 85-105.
[15] BAR-GERA H, BOYCE D. Origin-based algorithms for combined travel forecasting models[J]. Transportation Research Part B: Methodological, 2003, 37(5): 405-422. doi: 10.1016/S0191-2615(02)00020-6
[16] ZHAO Shu-zhi, NI Tong-he, WANG Yang, et al. A new approach to the prediction of passenger flow in a transit system[J]. Computers and Mathematics with Applications, 2010, 61(8): 1968-1974.
[17] ZHANG Da-fu, ZHANG Xin-ming, WANG Jian. Commuter travel identification based on bus IC data[J]. Procedia Social and Behavioral Sciences, 2013, 96: 1547-1555. doi: 10.1016/j.sbspro.2013.08.176
[18] WANG Yue-yue. Research on methods of extracting commuting trip characteristic based on public transportation multi-source data[D]. Beijing: Beijing University of Technology, 2014. (in Chinese)
[19] LIU Wu-sheng, ZHOU Xiang-dong, KUANG Kai. The method of deriving passenger flow of bus alighting stops based on smart card data and interval uncertainty[J]. Journal of Railway Science and Engineering, 2018, 15(11): 2988-2994. (in Chinese) https://www.cnki.com.cn/Article/CJFDTOTAL-CSTD201811032.htm
[20] AKIMA H. A new method of interpolation and smooth curve fitting based on local procedures[J]. Journal of the ACM, 1970, 17(4): 589-602. doi: 10.1145/321607.321609
[21] CERVERO R. Alternative approaches to modeling the travel-demand impacts of smart growth[J]. Journal of the American Planning Association, 2006, 72(3): 285-295. doi: 10.1080/01944360608976751
[22] KEEMIN S, HUNJIN S. Factors generating boarding at metro station in the Seoul metropolitan area[J]. Cities, 2010, 27(5): 358-368. doi: 10.1016/j.cities.2010.05.001
[23] YANG Jun, HOU Zhong-sheng. A grey Markov based on large passenger flow real-time prediction model[J]. Journal of Beijing Jiaotong University, 2013, 37(2): 119-123, 128. (in Chinese) doi: 10.3969/j.issn.1673-0291.2013.02.022
[24] ZHANG Xu. Urban rail transit passenger flow forecast based on refined land use data[D]. Beijing: Beijing Jiaotong University, 2019. (in Chinese)
[25] LIU Guo-jin, YIN Zhen-zhi, JIA Yun-jian, et al. Passenger flow estimation based on convolutional neural network in public transportation system[J]. Knowledge-Based Systems, 2017, 123: 102-115. doi: 10.1016/j.knosys.2017.02.016
[26] LI Ling-xian. Research on traffic congestion prediction algorithms based on CNN[D]. Chengdu: Chengdu University of Technology, 2019. (in Chinese)
[27] DAI Guo-wei, MA Chang-xi, XU Xue-cai. Short-term traffic flow prediction method for urban road sections based on space-time analysis and GRU[J]. IEEE Access, 2019, 7(1): 143025-143035.
[28] ZHAO Jian-dong, GAO Yuan, QU Yue-cai, et al. Travel time prediction: based on gated recurrent unit method and data fusion[J]. IEEE Access, 2018, 6(1): 70463-70472.
[29] NGUYEN T, NGUYEN G, NGUYEN B M. EO-CNN: an enhanced CNN model trained by equilibrium optimization for traffic transportation prediction[J]. Procedia Computer Science, 2020, 176: 800-809. doi: 10.1016/j.procs.2020.09.075
[30] XIONG Li-yan, ZHANG Lei, HUANG Xiao-hui, et al. DCAST: a spatiotemporal model with DenseNet and GRU based on attention mechanism[J]. Mathematical Problems in Engineering, 2021. doi: 10.1155/2021/8867776
[31] LYU Yi-sheng, DUAN Yan-jie, KANG Wen-wen, et al. Traffic flow prediction with big data: a deep learning approach[J]. IEEE Transactions on Intelligent Transportation Systems, 2015, 16(2): 865-873.
[32] ZHAO Jian-dong, GAO Yuan, BAI Zhi-ming, et al. Traffic speed prediction under non-recurrent congestion: based on LSTM method and BeiDou navigation satellite system data[J]. IEEE Intelligent Transportation Systems Magazine, 2019, 11(2): 70-81. doi: 10.1109/MITS.2019.2903431
[33] ZHAO Jian-dong, WU Hong-qiang, CHEN Liang-liang. Road surface state recognition based on SVM optimization and image segmentation processing[J]. Journal of Advanced Transportation, 2017. https://doi.org/10.1155/2017/6458495
[34] GUO Dong-ning, WU Yi-hong, SHAMAI S, et al. Estimation in Gaussian noise: properties of the minimum mean-square error[J]. IEEE Transactions on Information Theory, 2011, 57(4): 2371-2385. doi: 10.1109/TIT.2011.2111010
[35] WILLMOTT C, MATSUURA K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance[J]. Climate Research, 2005, 30(1): 79-82.
[36] ZHAO Jian-dong, GAO Yuan, GUO Yu-jie, et al. Travel time prediction of expressway based on multi-dimensional data and PSO-ARMAX model[J]. Advances in Mechanical Engineering, 2018, 10(2): 1-16.
-