Abstract
End-to-end autonomous driving algorithms have become a focus of current autonomous driving research and development. Classic reinforcement learning (RL) algorithms use information such as vehicle state and environmental feedback to train a vehicle to drive, obtaining the best policy through trial and error and thereby enabling end-to-end autonomous driving algorithm development. However, development efficiency remains low. To address the inefficiency and high complexity of training RL algorithms in virtual simulation environments, this paper proposes an asynchronous distributed RL framework and establishes an inter-process and intra-process multi-agent parallel soft actor-critic (SAC) distributed training framework, which accelerates online RL training on the Carla simulator. In addition, to further speed up model training and deployment, the paper proposes a Cloud-OTA-based system architecture for rapid distributed model training and deployment, consisting mainly of an over-the-air technology (OTA) platform, a cloud distributed training platform, and an on-vehicle computing platform. On this basis, to improve model reusability and reduce migration and deployment costs, the paper builds a ROS-based Autoware-Carla integrated validation framework. Experimental results show that, in a qualitative comparison with several mainstream autonomous driving methods, the proposed method trains faster, copes effectively with dense traffic flow conditions, improves the adaptability of end-to-end autonomous driving policies to unseen scenarios, and reduces the time and resources required for experiments in real environments.
Authors
刘卫国
项志宇
刘伟平
齐道新
王子旭
Liu Weiguo; Xiang Zhiyu; Liu Weiping; Qi Daoxin; Wang Zixu (School of Information and Electronic Engineering, Zhejiang University, Hangzhou 310058; National Innovation Center of Intelligent and Connected Vehicles, Beijing 100160)
Source
《汽车工程》
EI
CSCD
Peking University Core Journals
2023, No. 9, pp. 1637-1645 (9 pages)
Automotive Engineering
Funding
Supported by the National New Generation Artificial Intelligence Open Innovation Platform for Autonomous Driving project (2020AAA0103702).
Keywords
reinforcement learning
distributed system
multi-agent
autonomous driving
Carla
vehicle control