Abstract
The behavioral decision-making system integrates information about the environment and the ego vehicle so that an autonomous vehicle produces safe and reasonable driving behavior; it is the core of autonomous driving. Reinforcement learning algorithms adopt a form of self-supervised learning: the decision-making system of an autonomous vehicle learns the optimal decision model on its own by continually improving its policy while interacting with the environment, which offers a direction for building effective decision-making systems. This paper summarizes recent research progress in reinforcement-learning-based behavioral decision-making methods in three respects: improving decision accuracy, improving decision breadth, and handling uncertain factors. Improvements in decision accuracy rely mainly on the introduction of deep learning techniques with strong representational power. Improvements in decision breadth benefit from hierarchical abstraction techniques, which alleviate the curse of dimensionality by decomposing tasks. Uncertain factors are taken into account through the partially observable Markov decision process in order to improve driving safety.
Authors
张佳鹏
李琳
朱叶
ZHANG Jiapeng; LI Lin; ZHU Ye (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200000, China)
Source
《电子科技》
2021, No. 5, pp. 66-71 (6 pages)
Electronic Science and Technology
Funding
National Natural Science Foundation of China (61673277).
Keywords
autonomous driving
reinforcement learning
behavioral decision-making
self-supervised learning
policy improvement
decision accuracy
decision breadth
uncertain factors