Volume 25 Issue 1
Feb. 2025
XIAO Jian-li, QIU Xue, ZHANG Yang, SU Hai-sheng, LI Zhi-peng, ZHANG Chuan-ming. Review on large language models in transportation[J]. Journal of Traffic and Transportation Engineering, 2025, 25(1): 8-28. doi: 10.19818/j.cnki.1671-1637.2025.01.002

Review on large language models in transportation

doi: 10.19818/j.cnki.1671-1637.2025.01.002
Funds:
  • National Natural Science Foundation of China 92370201
  • National Natural Science Foundation of China 61603257
  • Fundamental Research Funds for the Central Universities 22120230311
More Information
  • Corresponding author: XIAO Jian-li(1982-), male, professor, PhD, audyxiao@sjtu.edu.cn
  • Received Date: 2024-07-25
  • Publish Date: 2025-02-25
  • Abstract: The promotion of large language models (LLMs) in transportation was further discussed, and their great potential was demonstrated in improving traffic management and control, enhancing traffic safety, and advancing autonomous driving. The basic concepts and development of LLMs, large vision models, and large multimodal models were systematically expounded. Representative LLMs in transportation were summarized in terms of their structures and training methods, and the major applications of LLMs in transportation were discussed, including traffic management and control, traffic safety, and autonomous driving. Research results show that, in traffic management and control, issues such as traffic signal control and traffic state prediction can be effectively addressed by applying LLMs, which also brings new possibilities for urban traffic management and reduces both traffic congestion and environmental pollution. In terms of traffic safety, compared with previous models, the application of LLMs significantly improves the ability to analyze and predict traffic accidents. Through deep learning of historical accident data, the models can identify areas and time periods with a high incidence of accidents, so that preventive measures can be taken and traffic safety improved. In the field of autonomous driving, the shift from traditional models to multimodal autonomous driving models not only enhances decision-making and environmental adaptation abilities, but also provides users with a safer and more comfortable driving experience. The potential and value of LLMs in transportation are explored, and practical suggestions are offered for building a more intelligent and efficient transportation system, such as reducing the computational cost of LLMs in transportation and improving the real-time performance and reliability of models.
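The abstract notes that, by learning from historical accident data, models can flag areas and time periods with a high incidence of accidents. As a deliberately minimal illustration of that output (a toy frequency count, not the method of any model reviewed here; the field names and records are invented for the example), ranking accident hotspots by (area, hour-of-day) cell can be sketched as:

```python
from collections import Counter

def accident_hotspots(records, top_k=3):
    """Count accidents per (area, hour-of-day) cell and return the
    top_k cells with the highest incidence."""
    counts = Counter((r["area"], r["hour"]) for r in records)
    return counts.most_common(top_k)

# Toy historical accident log (synthetic data for illustration only).
log = [
    {"area": "A", "hour": 8},
    {"area": "A", "hour": 8},
    {"area": "A", "hour": 17},
    {"area": "B", "hour": 17},
    {"area": "B", "hour": 17},
    {"area": "B", "hour": 17},
]

print(accident_hotspots(log, top_k=2))
# → [(('B', 17), 3), (('A', 8), 2)]
```

A deployed system would replace the raw frequency count with a learned spatio-temporal model, but the shape of the result, a ranked list of high-risk area/time cells, is the same signal that preventive measures would target.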

  • [1]
    DIMITRAKOPOULOS G, DEMESTICHAS P. Intelligent transportation systems[J]. IEEE Vehicular Technology Magazine, 2010, 5(1): 77-84.
    [2]
    LIN Yang-xin, WANG Ping, MA Meng. Intelligent transportation system (ITS): concept, challenge and opportunity[C]//IEEE. 2017 IEEE 3rd International Conference on Big Data Security on Cloud (Bigdatasecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS). New York: IEEE, 2017: 167-172.
    [3]
    OUYANG L, WU J, JIANG X, et al. Training language models to follow instructions with human feedback[C]//ACM. Advances in Neural Information Processing Systems. New York: ACM, 2022: 27730-27744.
    [4]
    ACHIAM J, ADLER S, AGARWAL S, et al. GPT-4 technical report[R]. San Francisco: OpenAI, 2023.
    [5]
    ZHANG Si-yao, FU Dao-cheng, LIANG Wen-zhe, et al. TrafficGPT: viewing, processing and interacting with traffic foundation models[J]. Transport Policy, 2024, 150: 95-105.
    [6]
    ZHOU Xing-cheng, LIU Ming-yu, YURTSEVER E, et al. Vision language models in autonomous driving and intelligent transportation systems[J]. arXiv, 2023, DOI: 10.48550/arXiv.2310.14414.
    [7]
    SHOAIB M R, EMARA H M, ZHAO Jun. A survey on the applications of frontier AI, foundation models, and large language models to intelligent transportation systems[C]//IEEE. 2023 International Conference on Computer and Applications (ICCA). New York: IEEE, 2023: 1-7.
    [8]
    CUI Can, MA Yun-sheng, CAO Xue, et al. A survey on multimodal large language models for autonomous driving[C]//IEEE. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE, 2024: 958-979.
    [9]
    ZHENG Ou, ABDEL-ATY M, WANG Dong-dong, et al. ChatGPT is on the horizon: could a large language model be suitable for intelligent traffic safety research and applications?[J]. arXiv, 2023, DOI: 10.48550/arXiv.2303.05382.
    [10]
    CUI Can, MA Yun-sheng, CAO Xu, et al. Receive, reason, and react: drive as you say, with large language models in autonomous vehicles[J]. IEEE Intelligent Transportation Systems Magazine, 2024, 16(4): 81-94. doi: 10.1109/MITS.2024.3381793
    [11]
    MCCORDUCK P, CFE C. Machines Who Think: a Personal Inquiry into the History and Prospects of Artificial Intelligence[M]. Natick: A.K. Peters, 2004.
    [12]
    LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324. doi: 10.1109/5.726791
    [13]
    CRESWELL A, WHITE T, DUMOULIN V, et al. Generative adversarial networks: an overview[J]. IEEE Signal Processing Magazine, 2018, 35(1): 53-65. doi: 10.1109/MSP.2017.2765202
    [14]
    VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//MIT Press. Proceedings of the 31st International Conference on Neural Information Processing Systems. Massachusetts: MIT Press, 2017: 6000-6010.
    [15]
    DEVLIN J, CHANG Ming-wei, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[J]. arXiv, 2018, DOI: 10.48550/arXiv.1810.04805.
    [16]
    RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training[R]. San Francisco: OpenAI, 2018.
    [17]
    BROWN T, MANN B, RYDER N, et al. Language models are few-shot learners[C]//ACM. Advances in Neural Information Processing Systems. New York: ACM, 2020: 1877-1901.
    [18]
    CHOWDHERY A, NARANG S, DEVLIN J, et al. PaLM: scaling language modeling with pathways[J]. Journal of Machine Learning Research, 2023, 24(240): 11342-11436.
    [19]
    TAYLOR R, KARDAS M, CUCURULL G, et al. Galactica: a large language model for science[J]. arXiv, 2022, DOI: 10.48550/arXiv.2211.09085.
    [20]
    TOUVRON H, LAVRIL T, IZACARD G, et al. LLaMA: open and efficient foundation language models[J]. arXiv, 2023, DOI: 10.48550/arXiv.2302.13971.
    [21]
    WEI J, TAY Y, BOMMASANI R, et al. Emergent abilities of large language models[J]. arXiv, 2022, DOI: 10.48550/arXiv.2206.07682.
    [22]
    SANH V, WEBSON A, RAFFEL C, et al. Multitask prompted training enables zero-shot task generalization[J]. arXiv, 2021, DOI: 10.48550/arXiv.2110.08207.
    [23]
    WEI J, WANG Xue-zhi, SCHUURMANS D, et al. Chain-of-thought prompting elicits reasoning in large language models[J]. arXiv, 2022, DOI: 10.48550/arXiv.2201.11903.
    [24]
    TAO Chao-fan, LIU Qian, DOU Long-xu. Scaling laws with vocabulary: larger models deserve larger vocabularies[C]// MIT Press. Proceedings of the 38th International Conference on Neural Information Processing Systems. Massachusetts: MIT Press, 2024: 1-33.
    [25]
    HOFFMANN J, BORGEAUD S, MENSCH A, et al. Training compute-optimal large language models[J]. arXiv, 2022, DOI: 10.48550/arXiv.2203.15556.
    [26]
    ZHAO W X, ZHOU Kun, LI Jun-yi, et al. A survey of large language models[J]. arXiv, 2023, DOI: 10.48550/arXiv.2303.18223.
    [27]
    OQUAB M, DARCET T, MOUTAKANNI T, et al. DINOv2: learning robust visual features without supervision[J]. arXiv, 2023, DOI: 10.48550/arXiv.2304.07193.
    [28]
    CARON M, TOUVRON H, MISRA I, et al. Emerging properties in self-supervised vision transformers[C]//IEEE. 2021 IEEE/CVF International Conference on Computer Vision(ICCV). New York: IEEE, 2021: 9630-9640.
    [29]
    ZHOU Jing-hao, WEI Chen, WANG Hui-yu, et al. iBOT: image BERT pre-training with online tokenizer[J]. arXiv, 2021, DOI: 10.48550/arXiv.2111.07832.
    [30]
    RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[C]//PMLR. International Conference on Machine Learning. New York: PMLR, 2021: 8748-8763.
    [31]
    TUO Yu-xiang, XIANG Wang-meng, HE Jun-yan, et al. AnyText: multilingual visual text generation and editing[J]. arXiv, 2023, DOI: 10.48550/arXiv.2311.03054.
    [32]
    HO J, JAIN A, ABBEEL P. Denoising diffusion probabilistic models[C]//MIT Press. Proceedings of the 33rd International Conference on Neural Information Processing Systems. Massachusetts: MIT Press, 2020: 6840-6851.
    [33]
    MA Jian, ZHAO Ming-jun, CHEN Chen, et al. GlyphDraw: seamlessly rendering text with intricate spatial structures in text-to-image Generation[J]. arXiv, 2023, DOI: 10.48550/arXiv.2303.17870.
    [34]
    CHEN Jing-ye, HUANG Yu-pan, LYU Teng-chao, et al. Textdiffuser: diffusion models as text painters[C]//MIT Press. Proceedings of the 37th International Conference on Neural Information Processing Systems. Massachusetts: MIT Press, 2023: 1-35.
    [35]
    JIANG Yu-ming, WU Tian-xing, YANG Shuai, et al. Videobooth: diffusion-based video generation with image prompts[C]//IEEE. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2024: 6689-6700.
    [36]
    RUIZ N, LI Yuan-zhen, JAMPANI V, et al. DreamBooth: fine tuning text-to-image diffusion models for subject-driven generation[C]//IEEE. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2023: 22500-22510.
    [37]
    WANG Yi, LI Kun-chang, LI Xin-hao, et al. Computer Vision-ECCV 2024[M]. Berlin: Springer International Publishing, 2024.
    [38]
    ZHAO Long, GUNDAVARAPU N B, YUAN Liang-zhe, et al. Videoprism: a foundational visual encoder for video understanding[J]. arXiv, 2024, DOI: 10.48550/arXiv.2402.13217.
    [39]
    LIU Ye, LI Si-yuan, WU Yang, et al. Umt: unified multi-modal transformers for joint video moment retrieval and highlight detection[C]//IEEE. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2022: 3032-3041.
    [40]
    ZHU De-yao, CHEN Jun, SHEN Xiao-qian, et al. MiniGPT-4: enhancing vision-language understanding with advanced large language models[J]. arXiv, 2023, DOI: 10.48550/arXiv.2304.10592.
    [41]
    CHIANG W L, LI Z, LIN Z, et al. Vicuna: an open-source chatbot impressing GPT-4 with 90%* ChatGPT quality[R/OL]. 2023, https://vicuna.lmsys.org.
    [42]
    DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[J]. arXiv, 2020, DOI: 10.48550/arXiv.2010.11929.
    [43]
    SHARMA P, DING Nan, GOODMAN S, et al. Conceptual captions: a cleaned, hypernymed, image alt-text dataset for automatic image captioning[C]//USAACL. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Stroudsburg: USAACL, 2018: 2556-2565.
    [44]
    ORDONEZ V, KULKARNI G, BERG T. Im2text: describing images using 1 million captioned photographs[C]//ACM. Proceedings of the 25th International Conference on Neural Information Processing Systems. New York: ACM, 2011: 1143-1151.
    [45]
    SCHUHMANN C, VENCU R, BEAUMONT R, et al. Laion-400m: open dataset of CLIP-filtered 400 million image-text pairs[J]. arXiv, 2021, DOI: 10.48550/arXiv.2111.02114.
    [46]
    ZANG Yu-hang, LI Wei, HAN Jun, et al. Contextual object detection with multimodal large language models[J]. arXiv, 2023, DOI: 10.48550/arXiv.2305.18279.
    [47]
    CARION N, MASSA F, SYNNAEVE G, et al. Computer Vision-ECCV 2020[M]. Berlin: Springer International Publishing, 2020.
    [48]
    HE K M, GKIOXARI G, DOLLáR P, et al. Mask R-CNN[C]//IEEE. 2017 IEEE International Conference on Computer Vision(ICCV). New York: IEEE, 2017: 2980-2988.
    [49]
    YANG Zheng-yuan, LI Lin-jie, LIN K, et al. The dawn of LMMs: preliminary explorations with GPT-4V(ision)[J]. arXiv, 2023, DOI: 10.48550/arXiv.2309.17421.
    [50]
    ANIL R, BORGEAUD S, ALAYRAC J B, et al. Gemini: a family of highly capable multimodal models[J]. arXiv, 2023, DOI: 10.48550/arXiv.2312.11805.
    [51]
    HENDRYCKS D, BURNS C, BASART S, et al. Measuring massive multitask language understanding[J]. arXiv preprint, 2020, DOI: 10.48550/arXiv.2009.03300.
    [52]
    DONG Xiao-yi, ZHANG Pan, ZANG Yu-hang, et al. InternLM-XComposer2-4KHD: a pioneering large vision-language model handling resolutions from 336 pixels to 4KHD[J]. arXiv, 2024, DOI: 10.48550/arXiv.2404.06512.
    [53]
    MATHEW M, KARATZAS D, JAWAHAR C V. DocVQA: a dataset for VQA on document images[C]//IEEE. 2021 IEEE Winter Conference on Applications of Computer Vision(WACV). New York: IEEE, 2021: 2200-2209.
    [54]
    MASRY A, LONG Do X, TAN J Q, et al. ChartQA: a benchmark for question answering about charts with visual and logical reasoning[C]//USAACL. Findings of the Association for Computational Linguistics: ACL 2022. Stroudsburg: USAACL, 2022: 2263-2279.
    [55]
    SINGH A, NATARAJAN V, SHAH M, et al. Towards VQA models that can read[C]//IEEE. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2019: 8317-8326.
    [56]
    ROHRBACH A, HENDRICKS L A, BURNS K, et al. Object hallucination in image captioning[J]. arXiv, 2018, DOI: 10.48550/arXiv.1809.02156.
    [57]
    LIU Yu-liang, LI Zhang, HUANG Ming-xin, et al. OCRBench: on the hidden mystery of OCR in large multimodal models[J]. arXiv, 2023, DOI: 10.48550/arXiv.2305.07895.
    [58]
    CONTRIBUTORS O C. Opencompass: a universal evaluation platform for foundation models[R]. GitHub Repository, 2023.
    [59]
    BAI Jin-ze, BAI Shuai, YANG Shu-sheng, et al. Qwen-VL: a versatile vision-language model with versatile abilities[J]. arXiv, 2023, DOI: 10.48550/arXiv.2308.12966.
    [60]
    WANG Wei-han, LYU Qing-song, YU Wen-meng, et al. CogVLM: visual expert for pretrained language models[J]. arXiv, 2023, DOI: 10.48550/arXiv.2311.03079.
    [61]
    YOUNG A, CHEN Bei, LI Chao, et al. Yi: open foundation models by 01. ai[J]. arXiv, 2024, DOI: 10.48550/arXiv.2403.04652.
    [62]
    WANG Peng, WEI Xiang, HU Fang-xu, et al. TransGPT: multi-modal generative pre-trained transformer for transportation[C]//IEEE. 2024 International Conference on Computational Linguistics and Natural Language Processing (CLNLP). New York: IEEE, 2024: 96-100.
    [63]
    DU Zheng-xiao, QIAN Yu-jie, LIU Xiao, et al. GLM: general language model pretraining with autoregressive blank infilling[C]//ACL. Proceedings of the 60th Annual Meeting of the Association for Computational linguistics. Stroudsburg: ACL, 2022: 320-335.
    [64]
    LI Zhong-hang, XIA Liang-hao, TANG Jia-bin, et al. UrbanGPT: spatio-temporal large language models[J]. arXiv, 2024, DOI: 10.48550/arXiv.2403.00813.
    [65]
    GUAN Wei-sheng, XIAO Jian-li. A review on parameters prediction of traffic flow by combining spatio-temporal features[J]. Journal of University of Shanghai for Science and Technology, 2022, 44(6): 592-602.
    [66]
    LONG Bai-chao, GUAN Wei-sheng, XIAO Jian-li. Spatio-temporal traffic flow prediction method based on data encoding and decoding[J]. Journal of University of Shanghai for Science and Technology, 2023, 45(2): 120-127.
    [67]
    YU Bing, YIN Hao-teng, ZHU Zhan-xing. Spatio-temporal graph convolutional networks: a deep learning framework for traffic forecasting[J]. arXiv, 2017, DOI: 10.48550/arXiv.1709.04875.
    [68]
    BAI Lei, YAO Li-na, LI Can, et al. Adaptive graph convolutional recurrent network for traffic forecasting[C]//ACM. Proceedings of the 34th International Conference on Neural Information Processing Systems. New York: ACM, 2020: 17804-17815.
    [69]
    YUAN Yuan, DING Jing-tao, FENG Jie, et al. UniST: a prompt-empowered universal model for urban spatio-temporal prediction[C]//ACM. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York: ACM, 2024: 4095-4106.
    [70]
    ZHANG Jun-bo, ZHENG Yu, QI De-kang. Deep spatio-temporal residual networks for citywide crowd flows prediction[C]//ACM. Proceedings of the AAAI Conference on Artificial Intelligence. New York: ACM, 2017: 1655-1661.
    [71]
    LIU Ling-bo, ZHANG Rui-mao, PENG Jie-feng, et al. Attentive crowd flow machines[C]//ACM. Proceedings of the 26th ACM International Conference on Multimedia. New York: ACM, 2018: 1553-1561.
    [72]
    JIN K H, WI J A, LEE E J, et al. TrafficBERT: pre-trained model with large-scale data for long-range traffic flow forecasting[J]. Expert Systems with Applications, 2021, 186: 115738. doi: 10.1016/j.eswa.2021.115738
    [73]
    CAESAR H, BANKITI V, LANG A H, et al. NuScenes: a multimodal dataset for autonomous driving[C]//IEEE. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2020: 11621-11631.
    [74]
    NEUHOLD G, OLLMANN T, BULÒ S R, et al. The mapillary vistas dataset for semantic understanding of street scenes[C]//IEEE. 2017 IEEE International Conference on Computer Vision(ICCV). New York: IEEE, 2017: 5000-5009.
    [75]
    SONG Xi-bin, WANG Peng, ZHOU Ding-fu, et al. ApolloCar3D: a large 3D car instance understanding benchmark for autonomous driving[C]//IEEE. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). New York: IEEE, 2019: 5447-5457.
    [76]
    ROS G, SELLART L, MATERZYNSKA J, et al. The SYNTHIA dataset: a large collection of synthetic images for semantic segmentation of urban scenes[C]//IEEE. 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2016: 3234-3243.
    [77]
    BEHRENDT K, NOVAK L, BOTROS R. A deep learning approach to traffic lights: detection, tracking, and classification[C]//IEEE. 2017 IEEE International Conference on Robotics and Automation (ICRA). New York: IEEE, 2017: 1370-1377.
    [78]
    STALLKAMP J, SCHLIPSING M, SALMEN J, et al. The German traffic sign recognition benchmark: a multi-class classification competition[C]//IEEE. The 2011 International Joint Conference on Neural Networks. New York: IEEE, 2011: 1453-1460.
    [79]
    ZHU Zhe, LIANG Dun, ZHANG Song-hai, et al. Traffic-sign detection and classification in the wild[C]//IEEE. 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2016: 2110-2118.
    [80]
    LIN T Y, MAIRE M, BELONGIE S, et al. Computer Vision-ECCV 2014[M]. Berlin: Springer International Publishing, 2014.
    [81]
    WEN Long-yin, DU Da-wei, CAI Zhao-wei, et al. UA-DETRAC: a new benchmark and protocol for multi-object detection and tracking[J]. Computer Vision and Image Understanding, 2020, 193: 102907. doi: 10.1016/j.cviu.2020.102907
    [82]
    SOCHOR J, HEROUT A, HAVEL J. BoxCars: 3D boxes as CNN input for improved fine-grained vehicle recognition[C]//IEEE. 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2016: 3006-3015.
    [83]
    DA Long-chao, GAO Min-quan, MEI Hao, et al. Prompt to transfer: sim-to-real transfer for traffic signal control with prompt learning[J]. arXiv, 2023, DOI: 10.48550/arXiv.2308.14284.
    [84]
    QIN Yan-yan, LUO Qin-zhong, HE Zheng-bing. Management and control method of dedicated lanes for mixed traffic flows with connected and automated vehicles[J]. Journal of Traffic and Transportation Engineering, 2023, 23(3): 221-231 doi: 10.19818/j.cnki.1671-1637.2023.03.017
    [85]
    HANNA J, STONE P. Grounded action transformation for robot learning in simulation[C]//ACM. Proceedings of the AAAI Conference on Artificial Intelligence. New York: ACM, 2017: 4931-4932.
    [86]
    LAI Si-qi, XU Zhao, ZHANG Wei-jia, et al. Large language models as traffic signal control agents: capacity and opportunity[J]. arXiv, 2023, DOI: 10.48550/arXiv.2312.16044.
    [87]
    LIU Chen-xi, YANG Sun, XU Qian-xiong, et al. Spatial-temporal large language model for traffic prediction[J]. arXiv, 2024, DOI: 10.48550/arXiv.2401.10134.
    [88]
    HU Zuo-an, DENG Jin-cheng, HAN Jin-li, et al. Review on application of graph neural network in traffic prediction[J]. Journal of Traffic and Transportation Engineering, 2023(5): 39-61. doi: 10.19818/j.cnki.1671-1637.2023.05.003
    [89]
    RADFORD A, WU J, CHILD R, et al. Language models are unsupervised multitask learners[EB/OL]. (2020-09-18)[2024-12-01], https://insightcivic.s3.us-east-1.amazonaws.com/language-models.pdf.
    [90]
    TOUVRON H, MARTIN L, STONE K, et al. LLaMA 2: open foundation and fine-tuned chat models[J]. arXiv, 2023, DOI: 10.48550/arXiv.2307.09288.
    [91]
    ZHOU Jie, CUI Gan-qu, HU Sheng-ding, et al. Graph neural networks: a review of methods and applications[J]. AI Open, 2020, 1: 57-81. doi: 10.1016/j.aiopen.2021.01.001
    [92]
    WU Bing, WANG Wen-xuan, LI Lin-bo, et al. Longitudinal control model for connected autonomous vehicles influenced by multiple preceding vehicles[J]. Journal of Traffic and Transportation Engineering, 2020, 20(2): 184-194. doi: 10.19818/j.cnki.1671-1637.2020.02.015
    [93]
    ZHENG O, ABDEL-ATY M, WANG D D, et al. TrafficSafetyGPT: tuning a pre-trained large language model to a domain-specific expert in transportation safety[J]. arXiv, 2023, DOI: 10.48550/arXiv.2307.15311.
    [94]
    JONGWIRIYANURAK N, ZENG Z C, WANG M H, et al. Framework for motorcycle risk assessment using onboard panoramic camera (short paper)[C]//Roger B, Dianna S, Sarah W, et al. Leibniz International Proceedings in Informatics (LIPIcs). Leeds: Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2023: 44: 1-44: 7.
    [95]
    LIU Hao-tian, LI Chun-yuan, WU Qing-yang, et al. Visual instruction tuning[C]// MIT Press. Proceedings of the 37th International Conference on Neural Information Processing Systems. Massachusetts: MIT Press, 2023: 1-25.
    [96]
    ARTEAGA C, PARK J W. A large language model framework to uncover underreporting in traffic crashes[J]. Journal of Safety Research, 2023, 92: 1-13.
    [97]
    TAY Y, DEHGHANI M, TRAN V Q, et al. UL2: unifying language learning paradigms[J]. arXiv, 2022, DOI: 10.48550/arXiv.2205.05131.
    [98]
    WANG Le-ning, REN Yi-long, JIANG Han, et al. AccidentGPT: accident analysis and prevention from V2X environmental perception with multi-modal large model[J]. arXiv, 2023, DOI: 10.48550/arXiv.2312.13156.
    [99]
    XU Run-sheng, TU Zheng-zhong, XIANG Hao, et al. CoBEVT: cooperative bird's eye view semantic segmentation with sparse transformers[J]. arXiv, 2022, DOI: 10.48550/arXiv.2207.02202.
    [100]
    MEHR E, JOURDAN A, THOME N, et al. DiscoNet: shapes learning on disconnected manifolds for 3D editing[C]//IEEE. Proceedings of the IEEE/CVF International Conference on Computer Vision. New York: IEEE, 2019: 3473-3482.
    [101]
    XU Run-sheng, XIANG Hao, TU Zheng-zhong, et al. Computer vision - ECCV 2022[M]. Berlin: Springer International Publishing, 2022.
    [102]
    WANG Run-min, ZHU Yu, ZHAO Xiang-mo, et al. Research progress on test scenario of autonomous driving[J]. Journal of Traffic and Transportation Engineering, 2021, 21(2): 21-37. doi: 10.19818/j.cnki.1671-1637.2021.02.003
    [103]
    WANG Xiao-feng, ZHU Zheng, HUANG Guan, et al. DriveDreamer: towards real-world-driven world models for autonomous driving[J]. arXiv, 2023, DOI: 10.48550/arXiv.2309.09777.
    [104]
    KIM S W, PHILION J, TORRALBA A, et al. DriveGAN: towards a controllable high-quality neural simulation[C]//IEEE. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2021: 5816-5825.
    [105]
    JIN Y, SHEN X, PENG H, et al. SurrealDriver: designing generative driver agent simulation framework in urban contexts based on large language model[J]. arXiv, 2023, DOI: 10.48550/arXiv.2309.13193.
    [106]
    WEN Li-cheng, FU Dao-cheng, LI Xin, et al. DiLu: a knowledge-driven approach to autonomous driving with large language models[J]. arXiv, 2023, DOI: 10.48550/arXiv.2309.16292.
    [107]
    XI Z, SUKTHANKAR G. A graph representation for autonomous driving[C]//MIT Press. Proceedings of the 36th International Conference on Neural Information Processing Systems. Massachusetts: MIT Press, 2022: 1-11.
    [108]
    XU Zhen-hua, ZHANG Yu-jia, XIE En-ze, et al. DriveGPT4: interpretable end-to-end autonomous driving via large language model[J]. IEEE Robotics and Automation Letters, 2004, 9: 8186-8193.
    [109]
    JIN Bu, LIU Xin-yu, ZHENG Yu-peng, et al. ADAPT: action-aware driving caption transformer[C]//IEEE. 2023 IEEE International Conference on Robotics and Automation (ICRA). New York: IEEE, 2023: 7554-7561.
    [110]
    MAO Jia-geng, QIAN Yu-xi, YE Jun-jie, et al. GPT-Driver: learning to drive with GPT[J]. arXiv, 2023, DOI: 10.48550/arXiv.2310.01415.
    [111]
    WANG Yu-qi, HE Jia-wei, FAN Lue, et al. Driving into the future: multiview visual forecasting and planning with world model for autonomous driving[J]. arXiv, 2023, DOI: 10.48550/arXiv.2311.17918.
    [112]
    JIANG Bo, CHEN Shao-yu, XU Qing, et al. VAD: vectorized scene representation for efficient autonomous driving[C]//IEEE. 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE, 2023: 8340-8350.
    [113]
    GAO Rui-yuan, CHEN Kai, XIE En-ze, et al. Magicdrive: street view generation with diverse 3D geometry control[J]. arXiv, 2023, DOI: 10.48550/arXiv.2310.02601.
    [114]
    YANG Kai-rui, MA En-hui, PENG Ji-bin, et al. BEVControl: accurately controlling street-view elements with multi-perspective consistency via bev sketch layout[J]. arXiv, 2023, DOI: 10.48550/arXiv.2308.01661.
    [115]
    MA Yun-sheng, CUI Can, CAO Xu, et al. LaMPilot: an open benchmark dataset for autonomous driving with language model programs[C]//IEEE. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2024: 15141-15151.
    [116]
    TREIBER M, HENNECKE A, HELBING D. Congested traffic states in empirical observations and microscopic simulations[J]. Scientific Reports, 2000, 62(2): 1805-1824.
    [117]
    KESTING A, TREIBER M, HELBING D. General lane-changing model MOBIL for car-following models[J]. Transportation Research Record, 2007, 1999(1): 86-94.
    [118]
    SHAO Hao, HU Yu-xuan, WANG Le-tian, et al. LMDrive: closed-loop end-to-end driving with large language models[C]//IEEE. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2024: 15120-15130.
    [119]
    WANG Wen-hai, XIE Jiang-wei, HU Chuan-yang, et al. DriveMLM: aligning multi-modal large language models with behavioral planning states for autonomous driving[J]. arXiv, 2023, DOI: 10.48550/arXiv.2312.09245.
    [120]
    FAN Hao-yang, ZHU Fan, LIU Chang-chun, et al. Baidu apollo em motion planner[J]. arXiv, 2018, DOI: 10.48550/arXiv.1807.08048.
    [121]
    SHAO Hao, WANG Le-tian, CHEN Ruo-bing, et al. Safety-enhanced autonomous driving using interpretable sensor fusion transformer[C]//PMLR. Conference on Robot Learning. New York: PMLR, 2023: 726-737.
    [122]
    SIMA Chong-hao, RENZ K, CHITTA K, et al. DriveLM: driving with graph visual question answering[J]. arXiv, 2023, DOI: 10.48550/arXiv.2312.14150.
    [123]
    HU Yi-han, YANG Jia-zhi, CHEN Li, et al. Planning-oriented autonomous driving[C]//IEEE. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2023: 17853-17862.
    [124]
    HAN Bing-ye, DU Zeng-ming, DAI Lei, et al. Modeling the dynamic performance of transportation infrastructure using panel data model in state-space specifications[J]. Journal of Traffic and Transportation Engineering (English Edition), 2023, 10(3): 441-453.
    [125]
    OLAYODE O I, DU B, SEVERINO A, et al. Systematic literature review on the applications, impacts, and public perceptions of autonomous vehicles in road transportation system[J]. Journal of Traffic and Transportation Engineering (English Edition), 2023, 10(6): 1037-1060.
