Abstract
When exploring unstructured, unknown environments such as mountains and jungles, UAVs must perform environment sensing and trajectory planning simultaneously, without prior information. Traditional methods are constrained by multiple factors, including algorithms and sensors, which results in limited exploration range, low efficiency, and susceptibility to environmental changes. To address this problem, this study proposes an autonomous exploration method for UAVs based on deep reinforcement learning. The method builds on the normalized advantage functions (NAF) algorithm and introduces three algorithmic enhancement mechanisms to improve the exploration range and efficiency of UAVs in unstructured, unknown environments. Experiments in a self-designed simulation environment show that the improved NAF algorithm achieves a larger exploration range and higher efficiency than the original version, while exhibiting better convergence and robustness.
Authors
TANG Jianing (唐嘉宁)
LI Chengyang (李成阳)
ZHOU Sida (周思达)
MA Mengxing (马孟星)
SHI Yang (施炀)
Affiliations: School of Electrical and Information Technology, Yunnan Minzu University, Kunming 650031, China; Yunnan Key Laboratory of Unmanned Autonomous System, Kunming 650031, China; Institute of Unmanned Autonomous Systems, Yunnan Minzu University, Kunming 650031, China
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2024, Issue S02, pp. 144-149 (6 pages)
Funding
National Natural Science Foundation of China (61963038, 62063035).
Keywords
Autonomous UAV exploration
Intelligent decision making
Deep reinforcement learning
NAF algorithm
Augmentation mechanism