Memory-guided learning and decision-making of agents: A perspective from hippocampal memory replay

Abstract: Memory replay plays an important role in improving the learning and decision-making abilities of organisms. Studies have shown that biological memory replay is carried out mainly by place cells in the hippocampus, and that it is diverse in both the order of activation and the specific locations activated. However, most existing methods for simulating hippocampal memory replay take a single form, modeling replay in only one direction or under only some conditions, and therefore struggle to reproduce the mechanism of hippocampal memory replay. Simulating the replay function of hippocampal place cells from multiple aspects, guided by the biological mechanism, in order to improve the learning and decision-making performance of agents therefore has significant research value and application prospects. For static grid scenarios, this paper uses a combined reinforcement learning mechanism to simulate the diversity of hippocampal reactivation: a bidirectional search model that alternates between two processes, trajectory sampling and prioritized sweeping, is designed to simulate the reactivation of memories at different locations in the hippocampus. In addition, online and offline learning are used to simulate the memory mechanisms of organisms in the awake and sleep states, respectively, so as to better reproduce the hippocampal memory replay process. Furthermore, for changing dynamic scenarios, a deep bidirectional search model with a "one set of parameters, two-stage updates" function is designed to improve the learning and decision-making performance of agents in dynamic environments. Agent navigation experiments in complex static and dynamic grid environments, together with performance comparisons against other reinforcement learning algorithms, verify the effectiveness of the proposed model.
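The abstract names two planning processes, trajectory sampling and prioritized sweeping, used in alternation, but this page does not give the authors' model. The sketch below is only a generic tabular illustration of how such an alternation can look, not the paper's method. The six-state corridor, the constants, and every function name (`step`, `td_error`, `trajectory_sample`, `prioritized_sweep`, `train`) are assumptions made for the example.

```python
import heapq
import random

# Hypothetical toy world (not from the paper): a corridor of states 0..5,
# terminal goal at state 5, actions move one cell left (-1) or right (+1).
N_STATES, GOAL = 6, 5
ACTIONS = (-1, +1)
GAMMA, ALPHA, THETA = 0.9, 1.0, 1e-4  # ALPHA=1 is safe: the model is deterministic


def step(s, a):
    """Deterministic model: clip to the corridor, reward 1 on entering the goal."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == GOAL else 0.0)


def td_error(Q, s, a):
    """One-step temporal-difference error of (s, a) under the model."""
    s2, r = step(s, a)
    boot = 0.0 if s2 == GOAL else GAMMA * max(Q[(s2, b)] for b in ACTIONS)
    return r + boot - Q[(s, a)]


def push_predecessors(Q, queue, s):
    """Queue every non-terminal (state, action) leading into s whose error exceeds THETA."""
    for sp in range(N_STATES):
        for ap in ACTIONS:
            if sp != GOAL and step(sp, ap)[0] == s:
                p = abs(td_error(Q, sp, ap))
                if p > THETA:
                    heapq.heappush(queue, (-p, sp, ap))  # max-priority via negation


def trajectory_sample(Q, queue, start, rng, eps=0.1, max_steps=50):
    """Forward pass: follow an epsilon-greedy path through the model, backing up each visit."""
    s = start
    for _ in range(max_steps):
        if s == GOAL:
            break
        best = max(Q[(s, b)] for b in ACTIONS)
        greedy = [b for b in ACTIONS if Q[(s, b)] == best]  # random tie-breaking
        a = rng.choice(ACTIONS) if rng.random() < eps else rng.choice(greedy)
        Q[(s, a)] += ALPHA * td_error(Q, s, a)
        push_predecessors(Q, queue, s)
        s, _ = step(s, a)


def prioritized_sweep(Q, queue, max_backups=100):
    """Backward pass: back up the highest-priority pair, then queue its predecessors."""
    n = 0
    while queue and n < max_backups:
        _, s, a = heapq.heappop(queue)
        Q[(s, a)] += ALPHA * td_error(Q, s, a)
        push_predecessors(Q, queue, s)
        n += 1


def train(rounds=30, seed=0):
    """Alternate the forward and backward passes for a fixed number of rounds."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    queue = []
    for _ in range(rounds):
        trajectory_sample(Q, queue, start=0, rng=rng)
        prioritized_sweep(Q, queue)
    return Q


if __name__ == "__main__":
    Q = train()
    print({s: max(ACTIONS, key=lambda b: Q[(s, b)]) for s in range(GOAL)})
```

In this sketch the forward pass spreads updates along paths the agent would plausibly take, while the backward pass propagates a newly discovered reward toward earlier states in order of how much their values would change. How the paper maps these two passes onto awake (online) and sleep (offline) replay is not specified on this page and is not modeled here.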

Authors: ZHU Jin-biao; WU Yi-fan; WANG Dong-shu (School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China; Innovation Center of Intelligent Systems, Longmen Laboratory, Luoyang, Henan 471000, China)

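Affiliations: School of Electrical and Information Engineering, Zhengzhou University; Innovation Center of Intelligent Systems, Longmen Laboratory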

Source: Control Theory & Applications (控制理论与应用), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2024, No. 10, pp. 1753-1764 (12 pages)

Funding: Supported by the National Natural Science Foundation of China (62173309, 61873245).

Keywords: memory-guided decision-making; hippocampus; memory replay; trajectory sampling; prioritized sweeping

Classification code: G63 [Cultural Sciences: Education]
