Open-sourced data ecosystem in autonomous driving: the present and future
Abstract With the continued maturation and deployment of autonomous driving technology, a systematic survey of open-source autonomous driving datasets helps foster a healthy industry ecosystem. Existing autonomous driving datasets can be broadly divided into two generations. First-generation datasets have relatively simple sensor modalities, are relatively small in scale, and are mostly limited to perception-level tasks; KITTI, released in 2012, is representative. Second-generation datasets, by contrast, feature more complex sensor modalities, greater scale and diversity, and tasks that extend from perception to prediction, planning, and control; nuScenes and Waymo, introduced around 2019, are representative. In collaboration with colleagues from academia and industry, this article presents the first systematic review of more than 70 open-source autonomous driving datasets from China and abroad. It summarizes how to build high-quality datasets, the central role of data in the closed loop between data and algorithms, and how generative foundation models can be used to produce data at scale. It also analyzes and discusses in detail the characteristics and data scale that future third-generation autonomous driving datasets should have, along with the scientific and technical problems that remain to be solved. The authors hope this summary and outlook will promote the construction of a new generation of autonomous driving datasets and their ecosystem, and advance original innovation and technological self-reliance in key areas.
Authors Hongyang LI; Yang LI; Huijie WANG; Jia ZENG; Huilin XU; Pinlong CAI; Li CHEN; Junchi YAN; Feng XU; Lu XIONG; Jingdong WANG; Futang ZHU; Chunjing XU; Tiancai WANG; Fei XIA; Beipeng MU; Zhihui PENG; Dahua LIN; Yu QIAO (Shanghai AI Lab, Shanghai 200232, China; Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; School of Information Science and Technology, Fudan University, Shanghai 200433, China; College of Automotive Studies, Tongji University, Shanghai 200092, China; Baidu, Beijing 100085, China; BYD Auto, Shenzhen 518118, China; Huawei, Shenzhen 518129, China; MEGVII Technology, Beijing 100096, China; Meituan, Beijing 100102, China; AGIBOT, Shanghai 201315, China)
Source Scientia Sinica (Informationis) (《中国科学:信息科学》), 2024, No. 6, pp. 1283-1318 (36 pages); indexed in CSCD and the Peking University Core Journals list
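Funding Supported by the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (Grant No. 2022ZD0160104); the National Natural Science Foundation of China Young Scientists Fund (Grant No. 62206172); the National Natural Science Foundation of China Major Research Plan, Key Program (Grant No. 92370201); the National Natural Science Foundation of China Excellent Young Scientists Fund (Grant No. 62222607); the Shanghai Rising-Star Program (Grant No. 22QA1412500); the China Postdoctoral Science Foundation (Grant No. 2023M741848); and the Shanghai Sailing Program for Young Science and Technology Talents (Grant No. 23YF1462000).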
Keywords autonomous driving; data pipeline (data-algorithm closed loop); foundation model; dataset and challenge