DS-MCTS: A Deep Sequential Monte-Carlo Tree Search Method for Source Navigation in Unknown Environments

Cited by: 1
Abstract: Source navigation has important applications in emergency rescue, industrial inspection, and other hazardous operations. In practice, the state of the environment often cannot be fully observed; that is, the environment is partially observable. Making real-time decisions from partial observations, and effectively predicting the future state of the system from historical sequence information, are challenging problems for research on source navigation. This paper proposes a source navigation algorithm and system framework based on deep sequential Monte-Carlo tree search (DS-MCTS). A sequential action prediction (SAP) network provides prior knowledge for MCTS decision-making, and a reward allocation prediction (RAP) network is built to improve the accuracy of reward allocation, ultimately realizing optimal decision-making for the system. Simulation experiments show that the DS-MCTS method provides an end-to-end source navigation solution that can effectively predict the agent's actions and achieve efficient and robust path planning.
Authors: DUAN Shi-hong (段世红); HE Hao (何昊); XU Cheng (徐诚); YIN Nan (殷楠); WANG Ran (王然) (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; Shunde Graduate School, University of Science and Technology Beijing, Foshan, Guangdong 528399, China)
Source: Acta Electronica Sinica (《电子学报》), 2022, No. 7, pp. 1744-1752 (9 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
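Funding: National Natural Science Foundation of China (No.62101029); Postdoctoral Innovative Talent Support Program (No.BX20190033); Guangdong Basic and Applied Basic Research Foundation Joint Fund (No.2019A1515110325); China Postdoctoral Science Foundation General Program (No.2020M670135); Postdoctoral Research Fund of Shunde Graduate School, University of Science and Technology Beijing (No.2020BH001); Fundamental Research Funds for the Central Universities (No.06500127).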
Keywords: source navigation; Monte-Carlo tree search; sequential decision-making; path planning; deep reinforcement learning
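
Illustrative sketch (not from the paper): the abstract says a prediction network supplies prior knowledge to MCTS decision-making but does not state the selection rule. The Python sketch below shows one common way a policy prior can bias tree search, the PUCT rule used in AlphaZero-style search. Treat it as an assumption, not the authors' method. The names `Node`, `puct_score`, `select_action`, `expand`, `backup`, the constant `c_puct`, and the four-move action set are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node. `prior` is the probability that a policy
    network (assumed here, standing in for an SAP-like model) assigned
    to the action leading to this node."""
    prior: float
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def mean_value(self) -> float:
        # Average backed-up reward; 0.0 for an unvisited node.
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    # Exploitation term plus a prior-weighted exploration bonus that
    # shrinks as the child accumulates visits.
    bonus = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.mean_value() + bonus


def select_action(parent: Node, c_puct: float = 1.5):
    # Choose the child action with the highest PUCT score.
    return max(parent.children,
               key=lambda a: puct_score(parent, parent.children[a], c_puct))


def expand(node: Node, priors: dict) -> None:
    # Create one child per action, seeded with the network's prior.
    for action, p in priors.items():
        node.children[action] = Node(prior=p)


def backup(path: list, reward: float) -> None:
    # Propagate a simulated reward to every node on the visited path.
    for node in path:
        node.visits += 1
        node.value_sum += reward


if __name__ == "__main__":
    root = Node(prior=1.0)
    expand(root, {"up": 0.7, "down": 0.1, "left": 0.1, "right": 0.1})
    root.visits = 1
    first = select_action(root)      # the high-prior move is tried first
    backup([root, root.children[first]], reward=-1.0)
    print(first, select_action(root))
```

With these made-up priors the search first follows the move the network favors; after a poor backed-up reward on that move, its score falls below the exploration bonus of the untried moves, so the search turns to them.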