Abstract
Embodied artificial intelligence (AI) driven by large models is an interdisciplinary field spanning AI, robotics, and cognitive science. It studies how to combine the perception, reasoning, and logical thinking abilities of large models with embodied AI in order to improve the data efficiency and generalization of existing embodied AI frameworks such as imitation learning, reinforcement learning, and model predictive control. In recent years, as the capabilities of large models have grown and the demonstration data, simulation platforms, and task suites for embodied AI have matured, the combination of large models and embodied AI is expected to become the next wave of AI and an important breakthrough in bringing AI to physical robots. This article surveys, analyzes, and looks ahead at embodied AI driven by large-scale foundation models (LFMs) from three perspectives. First, we review the technical background of large models and embodied AI, as well as the existing learning frameworks for embodied AI. Second, according to how large models empower embodied AI, we group existing research into five paradigms: LFM-driven environment perception, LFM-driven task planning, LFM-driven foundation policy, LFM-driven reward function, and LFM-driven data generation. Finally, we summarize the challenges facing LFM-driven embodied AI and discuss feasible technical routes, providing a reference for researchers and further advancing the national AI development strategy.
Authors
白辰甲
许华哲
李学龙
Chenjia BAI; Huazhe XU; Xuelong LI (Institute of Artificial Intelligence (TeleAI), China Telecom Corp. Ltd., Shanghai 200232, China; Institute of Artificial Intelligence (TeleAI), China Telecom Corp. Ltd., Beijing 100033, China; Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China)
Source
《中国科学:信息科学》
CSCD
Peking University Core Journals (北大核心)
2024, Issue 9, pp. 2035-2082 (48 pages)
Scientia Sinica (Informationis)
Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 61871470, 62306242).
Keywords
embodied AI
large-scale models
environment perception
task planning
foundation policy